Use advanced static analysis tools for feature detection

Open laowantong opened this issue 5 years ago • 0 comments

The techniques currently used to match algorithmic features in a given source code are:

  • a conversion of its AST into a textual flattened representation;
  • regular expressions for low-level features;
  • SQL queries for derived features.
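To make the three steps above concrete, here is a minimal sketch of such a pipeline. It is not Paroxython's actual implementation: the flattened line format, the two regular expressions, the `features` table, and the feature names (`for_loop`, `augmented_assignment`, `accumulate_in_loop`) are all hypothetical, chosen only for illustration. It assumes Python 3.8+ (for `end_lineno`).

```python
import ast
import re
import sqlite3


def flatten(node, path=""):
    """Yield one text line per AST node: its path and its line span (hypothetical format)."""
    here = f"{path}/{type(node).__name__}"
    start = getattr(node, "lineno", 0)  # nodes such as Module or Load carry no position
    end = getattr(node, "end_lineno", 0)
    yield f"{here} @{start}-{end}"
    for child in ast.iter_child_nodes(node):
        yield from flatten(child, here)


# Low-level features: one regular expression per feature, applied to the flat text.
PATTERNS = {
    "for_loop": re.compile(r"^.*/For @(\d+)-(\d+)$", re.M),
    "augmented_assignment": re.compile(r"^.*/AugAssign @(\d+)-(\d+)$", re.M),
}

# Derived feature: an augmented assignment located inside the span of a for loop.
DERIVE = """
INSERT INTO features
SELECT DISTINCT 'accumulate_in_loop', a.line_start, a.line_end
FROM features AS a JOIN features AS l
  ON a.name = 'augmented_assignment' AND l.name = 'for_loop'
 AND a.line_start BETWEEN l.line_start AND l.line_end
"""


def detect(source):
    """Return a dict mapping each feature name to a list of (start, end) line spans."""
    text = "\n".join(flatten(ast.parse(source)))
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE features (name TEXT, line_start INT, line_end INT)")
    for name, pattern in PATTERNS.items():
        rows = [(name, int(m[1]), int(m[2])) for m in pattern.finditer(text)]
        db.executemany("INSERT INTO features VALUES (?, ?, ?)", rows)
    db.execute(DERIVE)
    result = {}
    for name, start, end in db.execute("SELECT * FROM features ORDER BY line_start"):
        result.setdefault(name, []).append((start, end))
    db.close()
    return result


if __name__ == "__main__":
    print(detect("total = 0\nfor x in values:\n    total += x\n"))
```

Even in this toy version, the derived feature depends on line spans rather than on the tree structure itself, which hints at why patterns built this way can be fragile.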

This is not quite satisfactory, for a number of reasons:

  • Although the creation of new regular expressions for Paroxython is partially automated (see suggest_regexp.py) and does not require advanced functionalities like recursion (thanks to the flattening of the AST), they remain notoriously difficult to read and maintain.
  • Using SQL to derive new features is highly unusual, and sometimes verbose.
  • No support for inference.
  • Not based on sound theoretical foundations, resulting in fragile patterns, false negatives on corner cases, etc.

It would be interesting to explore third-party static analysis tools, and try to gradually replace my home-made patterns. The tools I am currently aware of are:

I have no experience with any of them, though.

laowantong, Aug 30 '20 10:08