feat: Add expected-extensions output mode to Magika CLI
Add a -e / --expected-exts mode to the Magika CLI, which outputs one or more expected file extensions for the input file, in case the input file's extension is missing or incorrect.
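To illustrate the idea, here is a minimal sketch of how a detected content type could be mapped to its expected extensions and compared against a file's actual extension. It does not call Magika; `EXPECTED_EXTS`, `expected_exts`, and `has_expected_ext` are hypothetical names, and the mapping entries are illustrative only, not taken from the real config.

```python
from pathlib import Path

# Hypothetical mapping from a detected content type to its expected
# extensions. In practice this data would come from the content types
# config; the entries below are illustrative assumptions.
EXPECTED_EXTS = {
    "javascript": ["js"],
    "python": ["py"],
    "jpeg": ["jpg", "jpeg"],
}


def expected_exts(content_type: str) -> list[str]:
    """Return the expected extensions for a content type (empty if unknown)."""
    return EXPECTED_EXTS.get(content_type, [])


def has_expected_ext(path: str, content_type: str) -> bool:
    """Check whether the file's extension is one of the expected ones."""
    # Normalize: drop the leading dot and compare case-insensitively.
    ext = Path(path).suffix.lower().lstrip(".")
    return ext in expected_exts(content_type)
```

With this sketch, a file named `script.txt` detected as `python` would be flagged as having an unexpected extension, and `expected_exts("python")` would supply the suggestion to print.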
Thank you for the PR! I'll need to give it some thought: we have received a similar request and need to think about how to integrate this for the long term. Leaving this open for now for visibility; I will follow up later on. Thanks!
@reyammer: I think we should treat this as a long-term feature. We also need to add more correct/expected extensions to https://github.com/google/magika/blob/main/python/magika/config/content_types_config.json
I see that many extension lists there are empty, and some entries are missing the full list of extensions. For example, JavaScript should also include .ts and .jsx.
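A quick way to audit this could look like the sketch below. It assumes each entry in the config is an object with an `extensions` list; that key name, the sample entries, and the helper `types_missing_extensions` are assumptions for illustration, not a confirmed description of the actual file.

```python
import json

# Inline sample standing in for the real config file. The structure
# (content type name -> object with an "extensions" list) is an assumption.
SAMPLE_CONFIG = json.loads("""
{
  "javascript": {"extensions": ["js"]},
  "markdown": {"extensions": ["md"]},
  "unknown_binary": {"extensions": []}
}
""")


def types_missing_extensions(config: dict) -> list[str]:
    """Return the sorted names of content types with no extensions listed."""
    return sorted(
        name for name, entry in config.items()
        if not entry.get("extensions")  # missing key or empty list
    )
```

Running `types_missing_extensions` on the loaded config would give a list of content types whose extension lists still need to be filled in.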
Yes, indeed. We are already working on v2 and adding many more extensions and types. Your examples are spot on. Sorry for not following up on this; I haven't had time to think about it, and we are currently pushing to start a new training round. I will get to this once I have a moment, and I agree we should have a feature in this direction!
Hi @reyammer, should I close this? I see that new code has been added to the main branch, which now conflicts with this PR.