shellcheck doesn't warn against misspelled function names
```shell
myFunction() {
  echo "hello world"
}

myFuncton
```
Here's what shellcheck currently says:
Nothing
Here's what I wanted or expected to see:
myFuncton is not a known function or typically available command
Why shouldn't shellcheck run a `command -v` on all command invocations? Obviously it might still pick up false positives where the PATH environment variable gets modified while the script executes, but that doesn't seem like a great drama.
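For illustration, `command -v` already distinguishes the two cases in the example above when run in the same shell session (a minimal sketch, not a proposal for how ShellCheck would implement it):

```shell
# 'command -v' resolves functions, builtins, and PATH entries,
# and fails (non-zero exit status) for unknown names.
myFunction() {
  echo "hello world"
}

command -v myFunction                                  # prints: myFunction
command -v myFuncton || echo "myFuncton: not found"    # prints: myFuncton: not found
```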
The main reason why it doesn't is that this would cause a lot of discrepancies between machines. A script that passes cleanly on one dev machine would fail on another or on a CI server.
I would love to warn about missing commands, but I can't think of a good way to do that which doesn't come at a great cost to the user in one way or another.
It's a linter though - surely its purpose is to make it easier to detect issues that MAY happen at runtime; it's not a compiler that's there to tell you whether your code can be executed at all. If you really were worried about different behaviour on different machines, then have it flag ALL external commands that are not Linux-standard, with the option to specify them as allowed via a directive or command line. Admittedly, getting a list of "Linux-standard" commands is a challenge in itself, though I'd be fine with it being somewhat minimal and having to enhance it as necessary.
Just my 2¢, but I personally feel that it would be impossible to do this reliably. You would be forced to build a list of (almost) all Unix/Linux commands in order to reliably tell the difference between a misspelled function name and an external command. And where do you draw the line? If 'only' one character differs between the definition of a function and where it's (supposedly) called/executed? When two characters differ? Three? And perhaps the user has written an external shell script with a name similar to that of a function defined in the script you are checking, which you cannot possibly know about when analyzing the script. I understand why one would want this feature, but I just think that it is not possible to do it reliably, so it is better left out. Again, just my 2¢. I'll crawl back under my rock now ;)
Why does it need to be "reliable"? It just needs to be better than humans are. The reason I installed this tool in the first place was that I had a long-running script abort irretrievably after 30 minutes or so because of a misspelled function name. As it happens, most of the linter errors I actually got were false positives (warnings about theoretically possible problems that couldn't actually happen in the way my script was getting used).
> Why does it need to be "reliable"?
Well, if it isn't correct (almost) all of the time, then the sheer number of false positives/negatives will annoy people intensely, leading them to silence the particular check by default (which would defeat its purpose), or stop using the tool altogether.
> It just needs to be better than humans are.
It does indeed need to perform better than a human would. I just think that (for the reasons mentioned in my earlier reply) it won't be better than a human in this particular case.
> As it happens most of the linter errors I actually got were false-positives
I honestly don't see the point you are trying to make here. Surely you cannot mean "I already get lots of false positives, so it's okay to knowingly/willingly introduce even more (potential) false positives"?
I'm saying that the nature of what a linter does means there's always a high chance of false positives. I would suggest that for detecting misspelled function names the chance is a good deal lower than for many of the checks it already does. And there's always a way to filter them out with the right directives/command line options. As for it not being better than a human: I don't know where you work, but over 50% of my time doing code reviews is spent correcting other programmers' typos/spelling errors. My spelling is pretty damn good most of the time, but I'm human enough that I'm pretty likely to occasionally misspell, mis-remember or mistype a function name, and in all the other programming environments I work in I rely on tools to pick that up via static analysis, rather than at runtime.
Well, perhaps I am overestimating the difficulty of programmatically detecting the misspelling of a function name in a shell script. Since you have stated that you program for a living, perhaps you could say how you would go about detecting a misspelled function in a shell script?
I already did that in my first post - just run command -v on every command invocation (unless it matches a provided whitelist). The next level of doing fuzzy matches and suggesting the most likely intended command would be super nice of course, but I'm not proposing that.
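A rough sketch of such a pass (the extraction here is deliberately naive, taking the first word of each line as a candidate invocation; a real implementation would use ShellCheck's parser, and the whitelist entry is a made-up name):

```shell
# Naive sketch of the proposed check: extract the first word of each line
# of a script as a candidate command name, skip whitelisted names, and
# collect any name that 'command -v' cannot resolve.
script='myFunction() {
echo "hello world"
}
myFuncton'

whitelist="some_known_internal_tool"   # illustrative user-provided whitelist

unresolved=""
for name in $(printf '%s\n' "$script" |
              awk '/^[A-Za-z_][A-Za-z0-9_]*([[:space:]]|$)/ { print $1 }'); do
  case " $whitelist " in *" $name "*) continue ;; esac
  command -v "$name" > /dev/null 2>&1 || unresolved="$unresolved $name"
done

echo "unresolved:$unresolved"          # prints: unresolved: myFuncton
```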
> just run `command -v` on every command invocation
The problem with that approach (or running `which cmd`, for example) is that the results will only be accurate if both of the following conditions are true:
1. shellcheck is executed on the host it is intended to run on
2. shellcheck is executed as the user it is intended to run as
which really is a huge issue, as neither is required currently.
> (unless it matches a provided whitelist)
In order to create such a whitelist you would be forced to build a list of (almost) all Unix/Linux commands in existence (and keep it up to date), which might be possible in theory but is (almost) impossible to do in practice, and therefore also is a huge issue.
Yes, it's theoretically possible that on my machine I happen to have some weird command available called uncompres and therefore the linter fails to identify that I've accidentally misspelled uncompress as uncompres, but that's hardly a reason for throwing away the massively greater likelihood that it does correctly identify certain command names as being misspelled. And I would see the whitelist as something user-provided in order to specify the names of commands that are known to be available on ALL environments where the script is intended to be run but might not be available in the environment I'm running the linter. I'm seriously baffled that you think either of these edge cases is an issue of a magnitude that is even close to the usefulness of such functionality.
Maybe I'm completely missing something, but for me, shellcheck is something I use as a developer to run on my machine while I'm working on the script. If it tells me something is wrong, I either fix it, or determine it's not really an issue and either ignore the error or put in a directive or command line to tell the linter to ignore that particular case. As long as it significantly improves the probability that the script will run successfully on its target environment (by clearly listing likely problems and not burying what it reports in too much noise) it's doing its job.
The biggest objection I can think of is still the possibility that the script modifies the PATH variable - e.g. it's not hard to think of a case where you have a script and a sub-library of other scripts it calls, and the first thing your script does is add that directory to PATH so those scripts can be called without specifying their location. I'd still see this as an edge case that can be solved using a whitelist directive, however. (BTW, in cases where command invocations are specified with a relative directory, e.g. `./myFuncton`, it's probably better on balance just to skip the check.)
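The PATH-extension pattern in question looks like this (a self-contained sketch; `helper.sh` is an illustrative name, created here in a throwaway temp directory so the demo runs anywhere):

```shell
# Common pattern that would defeat a naive resolution check: the script
# extends PATH at startup so sibling scripts resolve without a ./ prefix.
libdir=$(mktemp -d)
printf '#!/bin/sh\necho "helper ran"\n' > "$libdir/helper.sh"
chmod +x "$libdir/helper.sh"

PATH="$libdir:$PATH"
helper.sh    # resolvable only because of the PATH change above
```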
As a developer, I would be very surprised to find out that ShellCheck was invoking commands on my computer while doing static analysis of scripts.
You're assuming that every command has a -v flag that outputs the version and never does anything destructive. In reality, there's just no such thing as a "safe" flag that can let you run a command and know it won't have any adverse side effects. What if I have a command that deletes files but doesn't check if it was invoked with flags? As soon as I run ShellCheck on a script using that command, data will be lost. Unexpected data loss is not an awesome feature for a static analysis tool to have.
Sorry, but anything where possible commands are invoked -- in any way -- is a non-starter. That will lead to very bad things happening.
What makes you think I suggested that? "command -v" is a command itself, which takes the name of the command as an argument after -v. It certainly doesn't execute the command.
> You're assuming that every command has a `-v` flag that outputs the version and never does anything destructive.
@irgeek, @wizofaus is proposing the literal shell built-in `command`.
```shell
$ echo $BASH_VERSION
4.4.19(1)-release
$ command -v pandoc
/usr/bin/pandoc
$ type -p pandoc
/usr/bin/pandoc
$ command -v mdv
mdv
$ command -V mdv
mdv is a function
mdv ()
{
    if type -p pandoc elinks > /dev/null; then
        pandoc ${1:? "No input document"} | elinks -force-html;
    fi
}
$ help command
command: command [-pVv] command [arg ...]
    Execute a simple command or display information about commands.

    Runs COMMAND with ARGS suppressing shell function lookup, or display
    information about the specified COMMANDs.  Can be used to invoke commands
    on disk when a function with the same name exists.

    Options:
      -p    use a default value for PATH that is guaranteed to find all of
            the standard utilities
      -v    print a description of COMMAND similar to the `type' builtin
      -V    print a more verbose description of each COMMAND

    Exit Status:
    Returns exit status of COMMAND, or failure if COMMAND is not found.
$ help type
type: type [-afptP] name [name ...]
    Display information about command type.

    For each NAME, indicate how it would be interpreted if used as a
    command name.

    Options:
      -a    display all locations containing an executable named NAME;
            includes aliases, builtins, and functions, if and only if
            the `-p' option is not also used
      -f    suppress shell function lookup
      -P    force a PATH search for each NAME, even if it is an alias,
            builtin, or function, and returns the name of the disk file
            that would be executed
      -p    returns either the name of the disk file that would be executed,
            or nothing if `type -t NAME' would not return `file'
      -t    output a single word which is one of `alias', `keyword',
            `function', `builtin', `file' or `', if NAME is an alias,
            shell reserved word, shell function, shell builtin, disk file,
            or not found, respectively

    Arguments:
      NAME  Command name to be interpreted.

    Exit Status:
    Returns success if all of the NAMEs are found; fails if any are not found.
```
Using `command` could cause false positives if a script is being developed for a different platform.
As an anecdote, this is very common at my work where our dev machines are all Linux, but the production machines are FreeBSD.
It's also possible to dynamically create shell functions with `eval`. The check would get tripped up by those too.
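For example (a minimal illustration; `make_greeter` and `greet` are made-up names):

```shell
# A function created at runtime via eval: 'greet' never appears as a
# literal definition anywhere, so no static resolution pass can see it.
make_greeter() {
  eval "$1() { echo \"hello, I am $1\"; }"
}

make_greeter greet
greet    # prints: hello, I am greet
```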
> Using `command` could cause false positives if a script is being developed for a different platform. As an anecdote, this is very common at my work where our dev machines are all Linux, but the production machines are FreeBSD.
Sure, not denying there might be false positives in some cases. But for my money, it's still the most valuable bit of checking a shell script linter can do.
I think it could be configured with options like this:

`--warn-wrong-commands --assume-installed=%coreutils --assume-installed=curl`

So you could configure which commands are available on the target system.
FWIW I've basically stopped using shell scripting now largely because of the lack of good tools to detect problems such as this before run-time.
There are some similarities between script variables and environment variables, and ShellCheck can distinguish the two via naming convention:

https://github.com/koalaman/shellcheck/wiki/SC2154
https://github.com/koalaman/shellcheck/wiki/SC2153

Would it be possible to apply some naming convention for functions to distinguish them from commands?
Possible naming-convention prefixes to express explicitly that something is a script function:

- `fn_` - self-explanatory; it would denote a function,
- `_` - the Pythonic way of marking an access specifier,
- `__` - the C++ way of reserving names/identifiers in the standard, to prevent them from being used by user code.
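Under such a convention, the opening example might look like this (a hypothetical sketch; the `fn_` prefix is not an existing ShellCheck feature):

```shell
# With an fn_ prefix convention, a linter could flag any unresolved fn_*
# name as a misspelled script function with high confidence.
fn_greet() {
  echo "hello world"
}

fn_greet     # resolvable: defined above
# fn_gret    # flaggable: fn_* prefix, but never defined in this script
```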
I see this is a very old issue; however, I just ran into this myself and wondered why shellcheck did not warn me. I was making a call to a `check_pt` function far down in the code, and the function was defined as `function check_pet {..}`. I expected shellcheck to discover this and warn me. It warns me about missing/invalid/unused variables, so why not do the same for functions? I would expect this behavior, as I get it from my linters for Python (ruff) and for PHP (phpstan).