feat: Add Expected extensions output mode to Magika CLI

Open nhonx opened this issue 2 years ago • 4 comments

Add an -e / --expected-exts mode to the Magika CLI, which outputs one or more expected file extensions for the input file, for cases where the input file's extension is missing or incorrect.
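To illustrate the idea, here is a minimal sketch of the lookup such a mode could perform once a content type label has been detected. It does not use Magika's API or reproduce the PR's code: the `EXPECTED_EXTS` table and both helper names are hypothetical, and the sample entries are placeholders rather than values from the real config.

```python
from pathlib import Path

# Hypothetical sample table: content type label -> expected extensions.
# In practice this data would come from Magika's content types config.
EXPECTED_EXTS = {
    "javascript": ["js"],
    "python": ["py", "pyi"],
    "jpeg": ["jpg", "jpeg"],
    "unknown": [],
}


def expected_extensions(label):
    """Return the expected extensions for a detected label (empty if none known)."""
    return EXPECTED_EXTS.get(label, [])


def extension_matches(path, label):
    """Return True if the file's current extension is one of the expected ones."""
    current = Path(path).suffix.lower().lstrip(".")
    return current in expected_extensions(label)
```

A caller could report `expected_extensions(label)` whenever `extension_matches(path, label)` is false, which covers both the missing and the incorrect extension cases.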

nhonx avatar Feb 19 '24 07:02 nhonx

Thank you for the PR! I'll need to give this some thought: we have received a similar request and need to work out how to integrate it for the long term. Leaving this open for now for visibility; will follow up later on. Thanks!

reyammer avatar Feb 19 '24 10:02 reyammer

@reyammer: I think we should treat this as a long-term feature. We also need to add more correct/expected extensions to https://github.com/google/magika/blob/main/python/magika/config/content_types_config.json. I see that many entries there have no extensions at all (empty), and others are missing part of the full list. For example, JavaScript should also include .ts and .jsx.
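One way to find the gaps is a small audit script. The sketch below assumes the config is a JSON object keyed by content type label, where each entry has an `extensions` list; that shape is an assumption here, and the inline `SAMPLE_CONFIG` is made-up data, not a copy of the real file.

```python
import json

# Made-up sample in the assumed shape: {label: {"extensions": [...]}}.
SAMPLE_CONFIG = json.loads("""
{
  "javascript": {"extensions": ["js"]},
  "labelA": {"extensions": []},
  "labelB": {}
}
""")


def labels_without_extensions(config):
    """Return the sorted labels whose extensions list is empty or absent."""
    return sorted(
        label for label, entry in config.items() if not entry.get("extensions")
    )


print(labels_without_extensions(SAMPLE_CONFIG))
```

Running the same function on the parsed contents of content_types_config.json would list the entries that still need extensions, provided the file matches the assumed shape.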

nhonx avatar Feb 22 '24 09:02 nhonx

Yes, indeed. We are already working on v2 and adding many more extensions / types. Your examples are spot on. Sorry for not following up on this; I haven't had time to think about it, and we are currently pushing to start a new training round. I will get to this once I have a moment, and I agree we should have a feature in this direction!

reyammer avatar Feb 23 '24 09:02 reyammer

Hi @reyammer, should I close this? I see that new code has been added to the main branch, and it now conflicts with this PR.

nhonx avatar Apr 01 '24 05:04 nhonx