humanify icon indicating copy to clipboard operation
humanify copied to clipboard

'MissingSemicolon'

Open springrider opened this issue 5 months ago • 1 comments

humanify gemini -m gemini-1.5-pro --apiKey="" obfuscated.js

obfuscated.js

I tried both openai & gemini, all encountered the same error:

Processing file 7/18
/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:367
    const error = new SyntaxError();
                  ^

SyntaxError: unknown: Missing semicolon. (17:29)

  15 | a.destroy = function () {};
  16 | exports.shared = a;
> 17 | exports.Ticker = i.default;ar n = t[r];
     |                              ^
  18 |       n.enumerable = n.enumerable || false;
  19 |       n.configurable = true;
  20 |       if ("value" in n) {
    at constructor (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:367:19)
    at Parser.raise (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:6630:19)
    at Parser.semicolon (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:6926:10)
    at Parser.parseExpressionStatement (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:13294:10)
    at Parser.parseStatementContent (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:12908:19)
    at Parser.parseStatementLike (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:12776:17)
    at Parser.parseModuleItem (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:12753:17)
    at Parser.parseBlockOrModuleBlockBody (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:13325:36)
    at Parser.parseBlockBody (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:13318:10)
    at Parser.parseProgram (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:12634:10)
    at Parser.parseTopLevel (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:12624:25)
    at Parser.parse (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:14501:10)
    at parse (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/parser/lib/index.js:14535:38)
    at parser (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/core/lib/parser/index.js:41:34)
    at parser.next (<anonymous>)
    at normalizeFile (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/core/lib/transformation/normalize-file.js:64:37)
    at normalizeFile.next (<anonymous>)
    at run (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/core/lib/transformation/index.js:22:50)
    at run.next (<anonymous>)
    at transform (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/@babel/core/lib/transform.js:22:33)
    at transform.next (<anonymous>)
    at step (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/gensync/index.js:261:32)
    at /Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/gensync/index.js:273:13
    at async.call.result.err.err (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/gensync/index.js:223:11)
    at /Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/gensync/index.js:50:45
    at step (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/gensync/index.js:287:14)
    at /Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/gensync/index.js:273:13
    at async.call.result.err.err (/Users/springrider/.nvm/versions/node/v22.17.0/lib/node_modules/humanifyjs/node_modules/gensync/index.js:223:11) {
  code: 'BABEL_PARSE_ERROR',
  reasonCode: 'MissingSemicolon',
  loc: Position { line: 17, column: 29, index: 345 },
  pos: 345,
  syntaxPlugin: undefined
}

I have attached the file need to be de-obfuscated, it's relative large, and the file is working without problem. Not sure why it constantly generate code that has basic grammar issue...

springrider avatar Sep 09 '25 08:09 springrider

@springrider As a starting point, it's probably worth checking if you get that error when running the file through webcrack on its own (which humanify uses under the hood); as that should help narrow down if the issue is within humanify itself or not:

  • https://github.com/j4k0xb/webcrack
    • https://webcrack.netlify.app/

Edit: Further context and suggestions from GitHub Copilot using GPT-5 (that I haven't deeply / manually verified, so take it with a grain of salt):

Short answer:

This isn’t an LLM problem. The failure happens before any model is used, when Humanify runs the extracted code through Babel. The snippet Babel is fed is syntactically invalid because a character has been chopped at a module boundary:

Expected: …; var n = t[r]; Got: …; ar n = t[r];

That “v” being missing makes the code invalid and triggers the “MissingSemicolon” error right at the transition after exports.Ticker = i.default;.

Root cause:

  • Pipeline location: The log “Processing file 7/18” indicates the input was first split/unpacked into multiple modules and Humanify is handling one of those pieces. Babel then parses each piece. The error means the piece handed to Babel is already broken.
  • Symptom: The broken fragment shows a classic off‑by‑one cut at a module boundary. The previous chunk ends with …i.default;, and the next chunk is supposed to start at var n = t[r]; (a typical transpiled helper pattern). Instead, it starts one character too far in, producing ar n = t[r];.
  • Responsibility: This points to the bundler-unpacking step (webcrack’s module extraction) or how Humanify uses its offsets/strings, not to the LLMs and not to Babel. The original bundle “works without problem,” so the corruption happens during splitting/rewriting of the bundle into modules.

How to confirm quickly:

  • Add a debug print of the raw text for “file 7/18” right before it is passed to Babel. If you already see …;ar n = t[r]; there, the corruption happened in the unbundling step (webcrack or boundary slicing in Humanify). If it’s …;var n = t[r]; but becomes broken later, then Humanify’s own transforms are at fault.
  • Alternatively, run the same input through webcrack alone and inspect the extracted module 7. If it starts with ar n, it’s a webcrack extraction bug.

Likely root in webcrack:

The pattern matches code that comes from Babel’s helper for defining properties (enumerable/configurable…). These helpers commonly appear right after export assignments in transpiled bundles. An off-by-one when slicing module code (start index shifted by 1) would drop the leading v in var.

Recommended fixes

Upstream/structural:

  • Check for and update to the latest webcrack. If this persists, report a bug upstream with the attached input; it’s almost certainly a boundary-detection issue in a specific wrapper pattern.
  • In Humanify’s integration, prefer using the full module source string returned by the unbundler (if provided) rather than re-slicing the original bundle with numeric offsets. If offsets must be used, verify they’re inclusive/exclusive correctly and add a guard that asserts the first non-whitespace token is a valid keyword/identifier (catch “ar ” at file start).
  • If Humanify applies any regex-based cleanup around export/require wrappers, review those patterns; a greedy match that consumes the preceding character would produce exactly this symptom.

Mitigations in Humanify:

  • Make Babel parsing tolerant: try @babel/parser with errorRecovery: true and, on parse failure, fall back to passing the raw text to the model without AST-based transforms. This avoids hard-stopping on a single module and still lets the deobfuscation proceed.
  • Add a defensive “boundary sanitization” step: when a module starts with ar n = … immediately after a semicolon in the previous chunk, insert the missing v (or more generally, ensure a newline/whitespace boundary and do not start mid-token). This is a pragmatic patch until the upstream fix lands.
  • Provide a CLI flag to skip bundle unpacking and process the original single file (which is valid JS) when the unpacker stumbles on a particular bundle.

What the user can do now:

  • As a quick workaround, run Humanify in single-file mode (skip unbundling) on this input, or manually fix the specific broken module by changing ar n to var n and re-running to get past this blocker.
  • Share the extracted “file 7/18” content around that boundary if possible; that will make isolating the exact off‑by‑one easier and speed up a proper fix.

0xdevalias avatar Sep 09 '25 10:09 0xdevalias