Overhaul option handling
Hi, me again.
After updating the performance of this project, I decided to reduce the package size even further and clean it further up by overhauling how fdir options are built with the given options and properties.
What did I do?
- Tolerate redundant checks to reduce bundle size
- Outsource types and fdir-related functions to their respective files for better project management
- Potentially improved performance by using less object property accesses
- Add forgotten performance improvement in
getPartialMatcher - Added more type definitions for better readability
This reduced the build size from
CJS dist\index.js 11.94 KB
CJS dist\index.js.map 21.24 KB
CJS ⚡️ Build success in 17ms
ESM dist\index.mjs 9.96 KB
ESM dist\index.mjs.map 21.10 KB
to
ESM dist\index.mjs 9.58 KB
ESM dist\index.mjs.map 20.59 KB
ESM ⚡️ Build success in 16ms
CJS dist\index.js 11.56 KB
CJS dist\index.js.map 20.73 KB
which is around 400b.
It seems like biome doesn't support 32 bit windows (or there is any other bug), so I can't lint and format :(
Also I have a question: Are found paths both processed in filter and exclude? If yes, there could be the possibility to cache processed paths, which would significantly improve the performance.
i haven't had the time to look at this pr, but answering your question: wouldn't work in most cases. fdir exclude paths are always absolute and filter paths aren't, plus the fact that you wouldn't even find directories in filter unless you set up the options to also receive dirs
Yeah, I tried caching anywhere and the tests started to fail. Damn...
I've realized the way tinyglobby manages it's own options was copy pasted between glob and globSync, so I tried to centralize the logic.
This also included some tiny restructurings and an additional type called Input to describe the first parameter of those two functions to increase readability. I had to make the GlobOptions type stricter and introduce a Partial<GlobOptions> for the options processing.
This change reduced the bundle size by another ~350 bytes.
ESM dist\index.mjs 9.23 KB
ESM dist\index.mjs.map 20.09 KB
ESM ⚡️ Build success in 18ms
CJS dist\index.js 11.19 KB
CJS dist\index.js.map 20.22 KB
CJS ⚡️ Build success in 19ms
DTS Build start
DTS ⚡️ Build success in 908ms
DTS dist\index.d.mts 1.01 KB
DTS dist\index.d.ts 1.01 KB
I found some more options handling logic in processPatterns that can be initially handled in getOptions. I also extended the InternalProps with cwd and expandDirs so we need to pass less arguments over to normalizePattern. Even though the build command says I've added 30 bytes in size again, it should still be smaller when minified.
Edit: After reverting the InternalProps extension and using GlobOptions instead, I managed to save 20 bytes more.
We are almost at 1KB reduced file size, basically almost 10% reduction. The only question is the behavior of options.followSymbolicLinks, options.onlyFiles and options.caseSensitiveMatch in the fdir options creation. They have an explicit comparison with false, implying a different behavior when it's undefined. This could be mitigated and save the bytes by omitting those explicit comparisons by defining a base option value for each of them, ensuring no undefined is passed through.
So this question is: Do false and undefined have a different behavior for these options?
yes they do, because those options are meant to default to true
Okay, so the undefined can be prevented by initializing all of them as true when processing tinyglobby's options, got it. It's easier to read that way as they don't have a third hidden state causing bugs.
okay! i thought initializing them to true was less performant, although that was just a guess and i have no actual evidence behind it
Well, there should always be a balance of package size and performance. If you care too much about package size, it's going to be slow, if you care too much about performance, it's going to be unreadable (because everything theoretically makes the program slower). I know one or two big projects that accidentally introduced this third buggy state of undefined and people kept building around it. For example, I wanted to debloat npm-check-updates and had to start over multiple times, because of it.
Defining all necessary default options makes the code look bigger, but is the most clean and easy to maintain way. You can define your options type as strict as possible, are able to process it centrally and use the input value with Partial before processing it. This helps you understand your options more and don't need to make sure over and over again that everything is correctly processed and bug-free.
Using this approach made this project almost 1KB smaller with this PR.
Speaking of performance: Since objects are dynamically sized and shaped, JS engines introduced the concept of a "shape" to optimize and organize the heap by implicitly defining two objects with the same properties as one fixed "shape" like two instances of a class. This improves performance as the engine becomes aware of what kind of object you want to use. TypeScript helps you utilize this concept better by defining interfaces for your objects or force you to define all properties of a class outside of the constructor to make the shape of an object static. So changing the values of properties is extremely fast, but adding and removing properties with undefined or delete is slow.
@SuperchupuDev Okay, I'm done and everything should be ready. If there is anything wrong with it, Let me know.
I'd be happy if this could be merged 🙏
Edit: Also I optimized a tiny thing from #91. Instead of splitting every time, we can just check input.startsWith('..'), because we are only looking at the first part.
great! will check it out at some point. no ETA since i got a cold and there's no rush for a new release right now anyways
Oh no!! Get well soon and don't stress yourself! ♥
Branch is now up-to-date again. Also I found a tiny thing from #100 that could be shortened.
Would be cool, if this could be merged 🙏
Branch is back up to date. If you need anything changed, just open a conversation. Hope this will get merged 🙏
me too, i still haven't checked it out and my next focus will probably be to work on some windows fixes. but at some point i will review
I understand. Thank you for the update!
Okay, the branch is back up to date. The whole thing should be around 10% smaller now, if this gets merged.
Are there any more things that can be reduced in size? In my opinion, yes. Namely switching from default imports of path and posix to destructured imports of normalize, relative and so on. While bundlers like esbuild don't mangle property function names, they can alias destructured imports to short names.
It depends if you are open to that change in this or a new PR.
I saw that you updated the branch again. Just give me a go, if you are ready to review my PR, until then I'm stopping further updates on my branch.
yeah i plan to push a few things these days that might change the code structure a bit, will tell you when i'm done 👍 thank you for your work on this pr regardless
hello!! i believe i'm not going to significantly change the code structure of the repo these days, so if you bring this pr up to date i will happily review it. sorry for the wait :heart:
Wonderful! Thank you very much for notifying me. I will take care of it this weekend. If I find some optimizations from your recent commits, I will add them. You can rename the PR title how it fits you best for future reference, since it turned more into a rather big project refactoring.
I tried my best over the last few days to rebase it, but even my vscode has given up half-way through. Has fdir changed the type to APIBuilder? I couldn't find anything about it in the changelog.
Hi, no problem. I'm also struggling to find footing with the massively increased complexity from before.
I actually needed the APIBuilder for typing the return type of buildFdir. Now, with the relative callback, I added a structure called GlobCrawler. It improves performance on a microscopic level by not destructuring an array and is more expandable. Since fdir now doesn't expose APIBuilder anymore, I just copied it over. It's super small anyway.
Currently, the branch appears bigger, because I still have the old and new version of buildFDir. I should mark it as a draft, because I'm still migrating from the old version to the new version. But, I can guarantee you that I was able to decrease the output size.
I actually got a new PC which finally allows me to run Rust projects, so I can finally lint it myself🎉
I now migrated to the new version, reduced the size of the fdir options and also added something new:
It is important that you not only check for the existence of the FSLike functions, but also ensures that they are also functions. Otherwise you could just add anything to fs like:
fs: {
"readdir": "Hello",
"stat": 5,
"realpath": { "toast": "peanut-butter" }
}
I'll leave it as a draft now, because I still want to check if I can still find some code reductions.
I'm officially done!
Sadly, the extra options, validation and utilities increased the output size by a small margin (gzipped 300 bytes). Here is the final size comparison:
Before:
ℹ [CJS] dist\index.cjs 14.22 kB │ gzip: 4.32 kB
ℹ [CJS] dist\index.cjs.map 29.16 kB │ gzip: 8.66 kB
ℹ [CJS] 2 files, total: 43.38 kB
ℹ [ESM] dist\index.mjs 12.92 kB │ gzip: 3.85 kB
ℹ [ESM] dist\index.mjs.map 29.10 kB │ gzip: 8.63 kB
ℹ [ESM] dist\index.d.mts 4.68 kB │ gzip: 1.63 kB
ℹ [ESM] 3 files, total: 46.70 kB
ℹ [CJS] dist\index.d.cts 4.68 kB │ gzip: 1.63 kB
ℹ [CJS] 1 files, total: 4.68 kB
✔ Build complete in 139ms
After:
ℹ [CJS] dist\index.cjs 14.35 kB │ gzip: 4.63 kB
ℹ [CJS] dist\index.cjs.map 27.09 kB │ gzip: 8.53 kB
ℹ [CJS] 2 files, total: 41.44 kB
ℹ [ESM] dist\index.mjs 13.04 kB │ gzip: 4.16 kB
ℹ [ESM] dist\index.mjs.map 27.03 kB │ gzip: 8.49 kB
ℹ [ESM] dist\index.d.mts 4.70 kB │ gzip: 1.62 kB
ℹ [ESM] 3 files, total: 44.77 kB
ℹ [CJS] dist\index.d.cts 4.70 kB │ gzip: 1.62 kB
ℹ [CJS] 1 files, total: 4.70 kB
I could try to inline things even further, but this would make the code uglier and possibly harder to understand.
I figured out why you wrote [crawler, relative] instead of using an object structure, as this saves output size. I managed to go below the unminified output size, but it's still a mystery to me why the gzip is still bigger:
ℹ [CJS] dist\index.cjs 13.99 kB │ gzip: 4.54 kB
ℹ [CJS] dist\index.cjs.map 26.54 kB │ gzip: 8.37 kB
ℹ [CJS] 2 files, total: 40.53 kB
ℹ [ESM] dist\index.mjs 12.68 kB │ gzip: 4.08 kB
ℹ [ESM] dist\index.mjs.map 26.47 kB │ gzip: 8.33 kB
ℹ [ESM] dist\index.d.mts 4.70 kB │ gzip: 1.62 kB
ℹ [ESM] 3 files, total: 43.85 kB
ℹ [CJS] dist\index.d.cts 4.70 kB │ gzip: 1.62 kB
ℹ [CJS] 1 files, total: 4.70 kB
This results in a whopping: 1.6% output size reduction for CJS and 1.8% for ESM unminified 🎉
Okay, now I believe I can't find anything anymore. I had to alter a test, because the TINYGLOBBY_DEBUG was evaluated in the test case before you were able to create the mockAccess to the property. But now, there is an unminified reduction of ~4%:
ℹ [CJS] dist\index.cjs 13.74 kB │ gzip: 4.50 kB
ℹ [CJS] dist\index.cjs.map 26.39 kB │ gzip: 8.32 kB
ℹ [CJS] 2 files, total: 40.14 kB
ℹ [ESM] dist\index.mjs 12.43 kB │ gzip: 4.04 kB
ℹ [ESM] dist\index.mjs.map 26.33 kB │ gzip: 8.28 kB
ℹ [ESM] dist\index.d.mts 4.70 kB │ gzip: 1.62 kB
ℹ [ESM] 3 files, total: 43.47 kB
ℹ [CJS] dist\index.d.cts 4.70 kB │ gzip: 1.62 kB
ℹ [CJS] 1 files, total: 4.70 kB
congrats for figuring out the size reduction :tada:! sadly this looks really hard to review due to the git diff shenanigans, would you mind splitting it up into smaller prs (or one new pr with many small commits)? that way we can also avoid problems with merging from main (i see some leftover code from older versions of the library here that was likely introduced that way) since i don't plan to push anything to main before this