Add pdfparser2 module

Open KarmaPenny opened this issue 6 years ago • 0 comments

I created a pdfparser in Go that does everything the existing pdfparser does and much more, and it's about 30x faster. Details on it can be found here

Usage:

pdfparser -f input.pdf output/

The above command creates the following files in the output dir:

  • commands.txt - list of commands run by launch actions
  • contents.txt - the text content of the pdf (can be scripts and contain urls etc.)
  • errors.txt - list of format errors and abnormalities that we might be able to write detections for
  • files.txt - list of the md5 hash and path of each referenced embedded or external file. Embedded files are extracted to the output dir using the md5 as the file name.
  • javascript.js - javascript of all actions in the pdf
  • raw.pdf - a decrypted and decoded version of the pdf
  • urls.txt - list of urls referenced by actions

We should create an ACE module that scans all of the above files with appropriate YARA rules. We may also want to add some of the info in those files as observables, such as embedded files, file paths, and URLs.
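
As a rough starting point for that module, here is a minimal Python sketch of the plumbing only: run the parser with the command shown above, then gather the output files to scan and the values that could become observables. It does not use the ACE module API or YARA at all. The function names (`run_pdfparser`, `read_lines`, `collect_results`) are hypothetical, and it assumes `urls.txt` and `commands.txt` hold one entry per line, which this issue does not confirm.

```python
import subprocess
from pathlib import Path

# Output file names exactly as listed in this issue.
OUTPUT_FILES = [
    "commands.txt", "contents.txt", "errors.txt", "files.txt",
    "javascript.js", "raw.pdf", "urls.txt",
]


def run_pdfparser(pdf_path, output_dir, binary="pdfparser"):
    """Run the parser with the usage shown above: pdfparser -f input.pdf output/"""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the parser exits non-zero.
    subprocess.run([binary, "-f", str(pdf_path), str(output_dir)], check=True)


def read_lines(path):
    """Return the non-empty, stripped lines of a file, or [] if it is missing."""
    path = Path(path)
    if not path.is_file():
        return []
    text = path.read_text(errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]


def collect_results(output_dir):
    """Gather files to scan and candidate observables from the output dir."""
    out = Path(output_dir)
    return {
        # Files that actually exist and could be handed to a YARA scan.
        "scan_targets": [out / name for name in OUTPUT_FILES if (out / name).is_file()],
        # ASSUMPTION: one URL / one command per line.
        "urls": read_lines(out / "urls.txt"),
        "commands": read_lines(out / "commands.txt"),
    }
```

A module could call `run_pdfparser("input.pdf", "output/")` followed by `collect_results("output/")`, then pass each path in `scan_targets` to the YARA scanner and add each entry in `urls` as an observable. `files.txt` is left unparsed here because its exact line format is not described above.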

KarmaPenny, May 31 '19 17:05