bash-parser icon indicating copy to clipboard operation
bash-parser copied to clipboard

Error in double-quoting.js when handling quoted `)` and `$` characters

Open cetfor opened this issue 7 years ago • 4 comments

Hey, awesome job on bash-parser. I'm finding this extremely useful in my own work.

I've run into an issue when parsing ) and $ characters in quoted strings. Let me give an example to help explain.

Consider the following line:

command="$(echo "$input" | sed -e "s/^[ \t]*\([^ \t]*\)[ \t]*.*$/\1/g")"

This line is a part of a larger, more complex bash script I'm analyzing. There are actually two (2) issues here.

  1. At column 64, the $ causes an error.
  2. At column 55, the ) causes an error.

Removing the $ leads the parser to believe the ) at column 55 is the closing parentheses rather than a string literal.

Both issues boil down to line 50 in double-quoting.js, where reducers is undefined.

I understand this is a strange case since the issue here exists in parsing a string pattern. I'm honestly not too sure how to go about addressing this yet. If I come up with anything I'll make sure to comment here. Also, if you have any ideas please share them, I'd be happy to work on a fix as soon as I think of the best way to handle these cases.

cetfor avatar Aug 09 '18 19:08 cetfor

Hey @parro-it, I'm currently working on a fix for this issue and hope to get a PR in today. FYI, I've resolved the "build failures" due to test-loc.js which is just a silly double-quoting issue that throws a ton of errors - nothing critical. I hope we can ping back and forth on the PR and get something implemented to fix this. Thanks!

cetfor avatar Aug 21 '18 14:08 cetfor

As far as the issue goes that causes Error: TypeError: Cannot read property 'doubleQuoting' of undefined it appears line 38 of expansion-start.js, the method previousReducer can take a third argument of reducers, but it's not set. See expansion-parameters.js line 31 for that definition.

Changing: return state.previousReducer(state, [char].concat(source));

to: return state.previousReducer(state, [char].concat(source), reducers);

Resolves the issue Error: TypeError: Cannot read property 'doubleQuoting' of undefined - all unit tests run through npm test continue to pass with this change.

cetfor avatar Aug 21 '18 21:08 cetfor

With the above change in place, and using the following script:

const ast = parse('command="$(echo "$input" | sed -e "s/^[ \t]*\([^ \t]*\)[ \t]*.*$/\1/g")"');
console.log(JSON.stringify(ast, null, 2));

I get the following error:

/Users/john/Documents/cetfor/bash-parser/src/index.js:52
                throw new Error(err.stack || err.message);
                ^

Error: TypeError: Cannot read property 'char' of undefined
    at expansion.map.xp (/Users/john/Documents/cetfor/bash-parser/src/utils/tokens.js:80:21)
    at Array.map (<anonymous>)
    at tokenOrEmpty (/Users/john/Documents/cetfor/bash-parser/src/utils/tokens.js:76:45)
    at start (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/tokenizer/reducers/start.js:51:18)
    at tokenizer (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/tokenizer/index.js:193:13)
    at tokenizer.next (<anonymous>)
    at Object.next (/Users/john/Documents/cetfor/bash-parser/node_modules/iterable-lookahead/index.js:51:24)
    at Object.next (/Users/john/Documents/cetfor/bash-parser/node_modules/map-iterable/index.js:33:30)
    at filterIterator (/Users/john/Documents/cetfor/bash-parser/node_modules/filter-iterator/index.js:6:12)
    at filterIterator.next (<anonymous>)
    at parse (/Users/john/Documents/cetfor/bash-parser/src/index.js:52:9)
    at Object.<anonymous> (/Users/john/Documents/cetfor/bash-parser/ex.js:9:13)
    at Module._compile (module.js:652:30)
    at Object.Module._extensions..js (module.js:663:10)
    at Module.load (module.js:565:32)
    at tryModuleLoad (module.js:505:12)
    at Function.Module._load (module.js:497:3)
    at Function.Module.runMain (module.js:693:10)
    at startup (bootstrap_node.js:191:16)
    at bootstrap_node.js:612:3

The issue here exists in tokens.js line 80 in the tokenOrEmpty function. In this case, xp.loc has no end property!

I added the following dirty patch to see if I can continue debugging the issue:

// console.log('aaa', {token: state.loc, xp: xp.loc});
// return Object.assign({}, xp, {loc: {
// 	start: xp.loc.start.char - state.loc.start.char,
// 	end: xp.loc.end.char - state.loc.start.char
// }});
if(xp.loc.hasOwnProperty('end')) {
	return Object.assign({}, xp, {loc: {
		start: xp.loc.start.char - state.loc.start.char,
		end: xp.loc.end.char - state.loc.start.char
	}});
} else {
	return Object.assign({}, xp, {loc: {
		start: xp.loc.start.char - state.loc.start.char
	}});
}

This isn't ideal - just a way to contiue moving towards the issue. When I re-run the example with this "patch" I get the following error:

/Users/john/Documents/cetfor/bash-parser/src/index.js:50
                        throw err;
                        ^

SyntaxError: Unclosed "
    at tk (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/rules/syntaxerror-oncontinue.js:7:10)
    at Object.next (/Users/john/Documents/cetfor/bash-parser/node_modules/map-iterable/index.js:35:18)
    at Object.next (/Users/john/Documents/cetfor/bash-parser/node_modules/map-iterable/index.js:33:30)
    at Object.lex (/Users/john/Documents/cetfor/bash-parser/src/shell-lexer.js:6:31)
    at lex (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/built-grammar.js:317:27)
    at Parser.parse (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/built-grammar.js:330:26)
    at parse (/Users/john/Documents/cetfor/bash-parser/src/index.js:34:22)
    at setCommandExpansion (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/rules/command-expansion.js:17:21)
    at tokensUtils.setExpansions.token.expansion.map.xp (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/rules/command-expansion.js:38:12)
    at Array.map (<anonymous>)

I uncommented the console.log({value, xp}) statement on line 55 of command-expansion.js https://github.com/vorpaljs/bash-parser/blob/6fa4ca52fbb92f9eb6b94e2e8c16768210b7af3d/src/modes/posix/rules/command-expansion.js#L55 to see what the command was being seen as, and I get this:

{ loc: { start: 9, end: 50 },
  command: 'echo "$input" | sed -e "s/^[ \t]*([^ \t]*',
  type: 'command_expansion' }
{ loc: { start: 1, end: 6 },
  parameter: 'input',
  type: 'parameter_expansion' }

So for some reason, the command is being truncated where the \) is. I would expect this command to be echo "$input" | sed -e "s/^[ \t]*\([^ \t]*\)[ \t]*.*$/\1/g". Perhaps state.escaping is not being set when this occurs?

I'm starting to think all of this happens within the lexerPhases. Any insight would be greatly appreciated.

cetfor avatar Aug 21 '18 21:08 cetfor

Hi there, stumbled upon this problem as well (https://github.com/daniel-sc/bash-shell-to-bat-converter/issues/13) any updates/progress here?

daniel-sc avatar Jul 06 '20 11:07 daniel-sc