Optimize and/or/xor/ldivs, combine functions that reference each other
I found an optimization for 24-bit and 32-bit and/or/xor operations which saves 2 bytes/fetches per routine. In the process, I also realized that 24-bit and 32-bit could share implementations to save space.
However, I didn't want to force the 32-bit implementations into programs that don't use them, so @jacobly0 taught me some linker magic for conditionally including code blocks from the same file. I applied this magic to each set of CRT functions that shared an implementation, which reduces JPs to JRs in some cases and prevents unused code from being included in others.
Also, in the process of rearranging functions, I implemented a minor optimization to __ldivs for checking opposite signs.
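For readers unfamiliar with the idea, here is a rough C sketch of one common way to check for opposite signs before a signed divide: XOR the operands and look at the top bit. This is only an illustration under my own assumptions, not the assembly in this PR; the names `opposite_signs` and `divs_sketch` are hypothetical, and the real `__ldivs` may test the signs differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not the PR's code): true when a and b differ in sign.
 * XOR leaves the top bit set exactly when the two sign bits differ.
 * Zero counts as non-negative here. */
static bool opposite_signs(int32_t a, int32_t b)
{
    return (((uint32_t)a ^ (uint32_t)b) >> 31) != 0;
}

/* Hypothetical signed divide built on an unsigned divide: divide the
 * magnitudes, then negate the quotient only if the signs were opposite.
 * INT32_MIN / -1 is not handled specially in this sketch. */
static int32_t divs_sketch(int32_t n, int32_t d)
{
    assert(d != 0); /* division by zero is out of scope for this sketch */

    uint32_t un = n < 0 ? 0u - (uint32_t)n : (uint32_t)n;
    uint32_t ud = d < 0 ? 0u - (uint32_t)d : (uint32_t)d;
    uint32_t q = un / ud;

    return opposite_signs(n, d) ? (int32_t)(0u - q) : (int32_t)q;
}
```

The point of the XOR form is that a single sign test decides whether the quotient needs negating, instead of testing each operand's sign separately.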
Is there perhaps a better naming strategy we can use for files that define multiple symbols? Just joining them with an underscore is pretty confusing.
Proposals:
- Two underscores.
- A hyphen.
Yeah, I'm not sure how I feel about multiple routines all in the same file. Is there any way to do an 'include' instead, or something?
What is the status of this PR? Perhaps some changes could be split into individual PRs that are easier to review (like one PR per function)?
Honestly, I kind of forgot about this after losing the hard drive I developed it on. I also forgot how ridiculously extensive it was, beyond the bitwise operator optimizations. I definitely feel like at least splitting it into families of functions would be a good idea.
All right, I'll close it for now, but feel free to open a new PR with changes :)