
Add suffix as output

Open: delijati opened this issue 1 year ago • 1 comment

Hi, perfect timing. I just restored "some" files (40GB) with PhotoRec [1], but PhotoRec's file type detection assigned the wrong type to some of them. It would be super nice if the program could show me the "suffix" (the way -i shows the MIME type), e.g. ".md, .py, .rst, ...", so I can move each file to a matching folder.

[1] https://www.cgsecurity.org/wiki/PhotoRec

Current output (-i):

❯ magika -r /opt/SORTED/TXT -i  
/opt/SORTED/TXT/1/100464.txt: text/plain
/opt/SORTED/TXT/1/101476.txt: text/x-c
/opt/SORTED/TXT/1/101485.txt: text/x-c
/opt/SORTED/TXT/1/101565.txt: text/x-c
/opt/SORTED/TXT/1/101700.txt: text/plain
/opt/SORTED/TXT/1/101729.txt: text/plain
/opt/SORTED/TXT/1/101786.txt: text/x-asm
/opt/SORTED/TXT/1/101812.txt: text/x-asm
/opt/SORTED/TXT/1/101941.txt: text/x-asm
/opt/SORTED/TXT/1/105997.txt: text/x-makefile
/opt/SORTED/TXT/1/107439.txt: text/plain
/opt/SORTED/TXT/1/108033.txt: text/plain
/opt/SORTED/TXT/1/109413.txt: text/markdown
/opt/SORTED/TXT/1/111266.txt: application/javascript
/opt/SORTED/TXT/1/114086.txt: text/x-python
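
For illustration, output in the `path: mime/type` shape shown above could be post-processed into a sorting plan. This is only a minimal sketch: the `SUFFIX_BY_MIME` table is hand-written and hypothetical (it is not Magika's own mapping), and `parse_line` and `plan_moves` are made-up helper names.

```python
from pathlib import Path

# Hypothetical MIME-to-suffix table written by hand for illustration;
# it is NOT taken from Magika and only covers a few types.
SUFFIX_BY_MIME = {
    "text/plain": "txt",
    "text/x-c": "c",
    "text/markdown": "md",
    "text/x-python": "py",
    "application/javascript": "js",
}


def parse_line(line: str) -> tuple[str, str]:
    """Split one 'path: mime/type' line into (path, mime)."""
    # rpartition splits on the LAST ': ', so a path containing ': ' survives.
    path, _, mime = line.strip().rpartition(": ")
    return path, mime


def plan_moves(lines, dest_root):
    """Return (source, destination) pairs without touching the filesystem."""
    moves = []
    for line in lines:
        path, mime = parse_line(line)
        if not path:
            # Blank line or a line without the ': ' separator: skip it.
            continue
        # Types missing from the table are collected under 'unknown'.
        suffix = SUFFIX_BY_MIME.get(mime, "unknown")
        src = Path(path)
        moves.append((src, Path(dest_root) / suffix / src.name))
    return moves
```

Each pair returned by `plan_moves` could then be passed to `shutil.move` after creating the destination folder. Keeping the planning step free of side effects makes it easy to review the plan before anything is moved.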

delijati, Feb 17 '24 10:02

Magika does keep a list of expected extensions/suffixes for each content type in its Python config: https://github.com/google/magika/blob/main/python/magika/config/content_types_config.json
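
As a rough sketch of how such a config could be turned into a MIME-type-to-extensions lookup: the snippet below uses a tiny inline stand-in rather than the real file, and the field names `mime_type` and `extensions` are assumptions about the JSON layout that should be checked against the actual config. `extensions_by_mime` is a hypothetical helper name.

```python
import json

# Tiny stand-in for the config file. The entry names and the field names
# "mime_type" and "extensions" are ASSUMPTIONS about the schema, used here
# only for illustration.
SAMPLE_CONFIG = """
{
  "markdown": {"mime_type": "text/markdown", "extensions": ["md"]},
  "python": {"mime_type": "text/x-python", "extensions": ["py", "pyi"]},
  "txt": {"mime_type": "text/plain", "extensions": ["txt"]}
}
"""


def extensions_by_mime(config_text: str) -> dict:
    """Build a {mime_type: [extensions]} table from config-shaped JSON."""
    config = json.loads(config_text)
    table = {}
    for entry in config.values():
        mime = entry.get("mime_type")
        if mime is None:
            # Entry without a MIME type: nothing to index it under.
            continue
        known = table.setdefault(mime, [])
        for ext in entry.get("extensions", []):
            # Several content types may share one MIME type; avoid duplicates.
            if ext not in known:
                known.append(ext)
    return table
```

With the stand-in data, `extensions_by_mime(SAMPLE_CONFIG)["text/x-python"]` yields the list of suffixes recorded for that MIME type, which is the kind of lookup needed to go from -i style output to a file suffix.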

I made a PR that adds the ability to output the expected extensions, via the -e argument: https://github.com/google/magika/pull/78

nhonx, Feb 19 '24 07:02