Ideate on selectively compiling rules based on the scanned paths

Open egibs opened this issue 1 year ago • 1 comment

Right now, we compile every rule in the rules and third_party directories, which comes out to ~15,299 rules in total (after our bad-rule exclusions). Not all rules apply to all paths, so determining which rules best fit which paths would speed up scanning.

Even if we do something with tags or a rule reorganization and multiple new embed filesystems, I think this would have benefits.

The main caveat here is that we don't want to omit possible findings by excluding rules, so this effort will require some thought.

egibs, Jul 27 '24 12:07

Some thoughts:

  • Exclude rules that fall outside of the --min-risk level. This will become more important for scan, which excludes anything lower than HIGH.
  • Implement a meta field in our YARA files that filters out rules when a file doesn't match a programkind, for example a "filetype" field that specifies a matching MIME type.
  • We include many rules that may be irrelevant. It'd be interesting to benchmark which rules are slow or obsolete, and which rule providers are slow overall and should be disabled.

tstromberg, Sep 13 '24 12:09