codeql icon indicating copy to clipboard operation
codeql copied to clipboard

JS: Skip minified file if avg line length > 200

Open asgerf opened this issue 1 month ago • 3 comments

Improves the detection of minified files, by classifying files with an average line length over 200 as minified.

Minifiers typically compile a file to one long line, but sometimes there's a copyright comment in front of it, so we need to account for some normal-looking lines. Sometimes multiple such files are concatenated in a bundle.

Note that we already skip certain files based on filenames, such as *.min.js, but previously we did not detect minification based on contents.

asgerf avatar Dec 01 '25 13:12 asgerf

I'm wondering if there's some classes of cases we're going to miss by doing that (i.e. I would imagine URLs could result in >200 length lines, which are potentially prone to injection?), and if: 1) we'd be able to find this out when rolling it out; or 2) have a slightly different heuristic to avoid this?

I don't have a strong opinion here but something like:

const apiEndpoint = "https://api.example.com/v2/users/profile/settings/notifications/preferences/email?includeArchived=true&format=json&apiKey=abc123def456";
let blah = someOtherNormalJSStuff();
if(blah) {
  blah.obviouslyNotMinifiedJS();
}

Would be considered minified. No clue as to what the data looks like in terms of distribution of line lengths

miketsprague avatar Dec 05 '25 11:12 miketsprague

I'm wondering if there's some classes of cases we're going to miss by doing that (i.e. I would imagine URLs could result in >200 length lines, which are potentially prone to injection?), and if: 1) we'd be able to find this out when rolling it out; or 2) have a slightly different heuristic to avoid this?

I don't have a strong opinion here but something like:

const apiEndpoint = "https://api.example.com/v2/users/profile/settings/notifications/preferences/email?includeArchived=true&format=json&apiKey=abc123def456";
let blah = someOtherNormalJSStuff();
if(blah) {
  blah.obviouslyNotMinifiedJS();
}

Would be considered minified. No clue as to what the data looks like in terms of distribution of line lengths

I don't think that's quite right. Your code snippet is ~240 characters and 5 line breaks, for an average line length of 48 -- well below the limit.

To get above an average of 200, you either need a few really long lines, or a lot of >200 character lines.

tausbn avatar Dec 05 '25 12:12 tausbn

Ya you're right thanks @tausbn -- my brain forgot what an average is and read it just as "max" 🙃 . A 200 average length one for the above example would basically a helper js file that just had URL definitions hah

miketsprague avatar Dec 05 '25 13:12 miketsprague