`Rex::Commands::Fs::df` fails on `tcsh`
Describe the bug
df actually returns empty hash reference with or without a device argument in FreeBSD 13.1 (latest) release
This function returns a hash reference which reflects the output of df.
https://metacpan.org/pod/Rex::Commands::Fs#df([$device])
Expected behavior
hashref reflecting df(1) command's output
How to reproduce it
task 'disk', group => 'testing', sub {
say Dumper df;
};
$VAR1 = {};
Code example
No response
Additional context
No response
Rex version
1.13.4
Perl version
5.36.0
Operating system running rex
Debian 11
Operating system managed by rex
FreeBSD
How rex was installed?
cpan client
Thanks for the report, @rwp0! Could you attach a debug output from rex -d as a file, please?
In case you'd like to take a closer look at what might be going wrong meanwhile: Rex runs df $dev, and takes the lines from STDOUT, then parses the lines.
@ferki thanks for mentioning the -d option.
It seems df (without arguments) looks for a device resulting in df: 2: No such file or directory under FreeBSD using tcsh.
Short:
[2022-06-12 17:34:41] DEBUG - ========= ERR ============
[2022-06-12 17:34:41] DEBUG - df: 2: No such file or directory
[2022-06-12 17:34:41] DEBUG - ========= ERR ============
$VAR1 = {};
Long:
/usr/bin/perlbrew exec -q --with perl-5.36.0 /home/regular/perl5/perlbrew/perls/perl-5.36.0/bin/perl /usr/bin/rex -d -H HOST df
[2022-06-12 17:34:31] DEBUG - This is Rex version: 1.13.4
[2022-06-12 17:34:31] DEBUG - Command Line Parameters
[2022-06-12 17:34:31] DEBUG - H = HOST
[2022-06-12 17:34:31] DEBUG - d = 1
[2022-06-12 17:34:31] DEBUG - Creating lock-file (Rexfile.lock)
[2022-06-12 17:34:31] DEBUG - Loading Rexfile
[2022-06-12 17:34:31] DEBUG - Enabling task_chaining_cmdline_args feature
[2022-06-12 17:34:31] DEBUG - Activating new template engine.
[2022-06-12 17:34:31] DEBUG - Disabling usage of a tty
[2022-06-12 17:34:31] DEBUG - Activating autodie.
[2022-06-12 17:34:31] DEBUG - Using Net::OpenSSH if present.
[2022-06-12 17:34:31] DEBUG - Add service check.
[2022-06-12 17:34:31] DEBUG - Setting set() to not append data.
[2022-06-12 17:34:31] DEBUG - Registering CMDB as template variables.
[2022-06-12 17:34:31] DEBUG - activating featureset >= 0.51
[2022-06-12 17:34:31] DEBUG - activating featureset >= 0.40
[2022-06-12 17:34:31] DEBUG - activating featureset >= 0.35
[2022-06-12 17:34:31] DEBUG - activating featureset >= 0.31
[2022-06-12 17:34:31] DEBUG - Creating new distribution class of type: Base
[2022-06-12 17:34:31] DEBUG - new distribution class of type Rex::TaskList::Base created.
[2022-06-12 17:34:31] DEBUG - Creating task: local-df
[2022-06-12 17:34:31] DEBUG - server: HOST
[2022-06-12 17:34:31] DEBUG - Returning existing distribution class of type: Rex::TaskList::Base
[2022-06-12 17:34:31] DEBUG - Creating task: df
[2022-06-12 17:34:31] DEBUG - server: HOST
[2022-06-12 17:34:31] DEBUG - Initializing Logger from parameters found in Rexfile
[2022-06-12 17:34:31] DEBUG - Returning existing distribution class of type: Rex::TaskList::Base
[2022-06-12 17:34:31] DEBUG - Returning existing distribution class of type: Rex::TaskList::Base
[2022-06-12 17:34:31] INFO - Running task df on HOST
[2022-06-12 17:34:31] DEBUG - Rex::Group::Entry::Server (private_key): returning
[2022-06-12 17:34:31] DEBUG - Rex::Group::Entry::Server (public_key): returning
[2022-06-12 17:34:31] DEBUG - $VAR1 = '';
[2022-06-12 17:34:31] DEBUG - Auth-Information inside Task:
[2022-06-12 17:34:31] DEBUG - private_key => []
[2022-06-12 17:34:31] DEBUG - sudo_password => [[**********]]
[2022-06-12 17:34:31] DEBUG - port => [[22]]
[2022-06-12 17:34:31] DEBUG - sudo => [[]]
[2022-06-12 17:34:31] DEBUG - password => [[%s]]
[2022-06-12 17:34:31] DEBUG - public_key => [[]]
[2022-06-12 17:34:31] DEBUG - user => [[root]]
[2022-06-12 17:34:31] DEBUG - auth_type => [[key]]
[2022-06-12 17:34:31] DEBUG - Using Net::OpenSSH for connection
[2022-06-12 17:34:31] DEBUG - Using user: root
[2022-06-12 17:34:31] DEBUG - Connecting to HOST:22 (root)
[2022-06-12 17:34:31] DEBUG - get_openssh_opt()
[2022-06-12 17:34:31] DEBUG - $VAR1 = {
'StrictHostKeyChecking' => 'no'
};
[2022-06-12 17:34:31] DEBUG - OpenSSH: key_auth or not defined: HOST:22 - root
[2022-06-12 17:34:31] DEBUG - OpenSSH options:
[2022-06-12 17:34:31] DEBUG - $VAR1 = [
'HOST',
'user',
'root',
'port',
22,
'master_opts',
[
'-o',
'ConnectTimeout=2',
'-o',
'LogLevel=QUIET',
'-o',
'StrictHostKeyChecking=no'
],
'default_ssh_opts',
$VAR1->[6]
];
[2022-06-12 17:34:31] DEBUG - OpenSSH constructor options:
[2022-06-12 17:34:31] DEBUG - $VAR1 = {};
[2022-06-12 17:34:31] DEBUG - Trying following auth types:
[2022-06-12 17:34:31] DEBUG - $VAR1 = [
'key',
'pass'
];
[2022-06-12 17:34:37] DEBUG - Current Error-Code: 0
[2022-06-12 17:34:37] DEBUG - Connected and authenticated to HOST.
[2022-06-12 17:34:37] DEBUG - Successfully authenticated on HOST.
[2022-06-12 17:34:37] DEBUG - Executing: perl -MFile::Spec -le 'print File::Spec->tmpdir'
[2022-06-12 17:34:37] DEBUG - Detecting shell...
[2022-06-12 17:34:37] DEBUG - Searching for shell: zsh
[2022-06-12 17:34:38] DEBUG - Searching for shell: ksh
[2022-06-12 17:34:38] DEBUG - Searching for shell: bash
[2022-06-12 17:34:38] DEBUG - Searching for shell: idrac
[2022-06-12 17:34:38] DEBUG - Rex::Interface::Exec::OpenSSH/executing: version
[2022-06-12 17:34:39] DEBUG - ========= ERR ============
[2022-06-12 17:34:39] DEBUG - version: Command not found.
[2022-06-12 17:34:39] DEBUG - ========= ERR ============
[2022-06-12 17:34:39] DEBUG - Searching for shell: ash
[2022-06-12 17:34:39] DEBUG - Searching for shell: csh
[2022-06-12 17:34:40] DEBUG - Searching for shell: sh
[2022-06-12 17:34:40] DEBUG - Searching for shell: tcsh
[2022-06-12 17:34:40] DEBUG - Found shell and using: tcsh
[2022-06-12 17:34:40] DEBUG - SSH/executing: set LC_ALL=C ; perl -MFile::Spec -le 'print File::Spec->tmpdir'
[2022-06-12 17:34:41] DEBUG - /tmp
[2022-06-12 17:34:41] DEBUG - Executing: which perl
[2022-06-12 17:34:41] DEBUG - Detecting shell...
[2022-06-12 17:34:41] DEBUG - Found shell in cache: tcsh
[2022-06-12 17:34:41] DEBUG - SSH/executing: set LC_ALL=C ; set PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/usr/pkg/bin:/usr/pkg/sbin; which perl
[2022-06-12 17:34:41] DEBUG - /usr/bin/perl
[2022-06-12 17:34:41] DEBUG - Executing df
[2022-06-12 17:34:41] DEBUG - Executing: df 2>/dev/null
[2022-06-12 17:34:41] DEBUG - Detecting shell...
[2022-06-12 17:34:41] DEBUG - Found shell in cache: tcsh
[2022-06-12 17:34:41] DEBUG - SSH/executing: set LC_ALL=C ; df 2>/dev/null
[2022-06-12 17:34:41] DEBUG - ========= ERR ============
[2022-06-12 17:34:41] DEBUG - df: 2: No such file or directory
[2022-06-12 17:34:41] DEBUG - ========= ERR ============
$VAR1 = {};
[2022-06-12 17:34:41] DEBUG - Running postponed notifications.
[2022-06-12 17:34:41] DEBUG - Connections in queue: 1
[2022-06-12 17:34:41] DEBUG - Destroying all cached os information
[2022-06-12 17:34:41] DEBUG - Waiting for children to finish
[2022-06-12 17:34:41] DEBUG - Need to reinitialize connections.
[2022-06-12 17:34:41] DEBUG - Returning existing distribution class of type: Rex::TaskList::Base
[2022-06-12 17:34:41] INFO - All tasks successful on all hosts
[2022-06-12 17:34:41] DEBUG - Removing lockfile
[2022-06-12 17:34:42] DEBUG - Returning existing distribution class of type: Rex::TaskList::Base
Process finished with exit code 0
Thanks for the debug logs, and for nicely formatting them as code! :heart: The full output can get quite long, so that might be easier to attach as file next time :wink:
Debug log analysis
As you found in the debug log correctly, the part leading to the error is:
- rex tries to execute
df $dev 2>/dev/null -
$devis empty so the full command becomesdf 2>/dev/null - and that fails with a
df: 2: No such file or directoryerror
The debug log also shows that rex found tcsh as a shell, and the problem seems to be that 2>/dev/null doesn't work on tcsh to redirect STDERR to /dev/null, but 2 is being passed to the df command as first argument.
In that sense it's related more to tcsh than to FreeBSD. I installed tcsh on Gentoo for a quick test and I can reproduce the error :tada: It might also worth to clarify in the issue title later for future readers.
Possible workarounds
- If possible, use a different shell than
tcshfor the user used to connect with rex. According to Wikipediatcshis the default shell forrootuser on FreeBSD, andshis default for regular users, so this would probably also work when connecting with a regular user.
Possible fix
There are a few code smells around the current execution logic:
- attempts to redirect STDERR o
/dev/null, but saving STDERR output into$erranyway - but then
$erris not used afterwards for anything
It's probably best to simplify that, and might also be possible to either:
- remove the non-portable redirect
- use a different way (like
i_run()) to execute the command (especially since Rex::Interface::Shell::Tcsh might help in abstracting shell details already) - introduce a condition to use a different command without the redirect when
tcshis detected
Risks
- A quick scan of the codebase finds 60 matches for the
2>string, so this kind of error might not be limited todfonly. Fixingdfmight be straightforward, but most probably a careful review of all similar call is warranted to prevent similar errors elsewhere. - Apparently this is a very old part of the codebase (introduced in 0.7.0) which has received several changes since then. It's probably best to review that history to better understand the motivation behind the current form.
AFAIK GitHub doesn't support attaching files natively, so pastebin services could be used/suggested for that purpose
(t)csh is (and always was IIRC) the default shell in FreeBSD so the Rex df command won't work on any version/flavor of FreeBSD as well because of that
I completely agree with all your points, and I appreciate your very much detailed response, prompt reaction, and the label being added 🙂
The debug log also shows that rex found tcsh as a shell, and the problem seems to be that 2>/dev/null doesn't work on tcsh to redirect STDERR to /dev/null, but 2 is being passed to the df command as first argument.
Unfortunately tcsh doesn't support 2> operator, and that's how standard error can be redirected^1 in tcsh:
$ (ls -l /bin/bash /bing/other > out) > & err
(direct stdout to 'out' and stderr to 'err')
Pasting/attaching
AFAIK GitHub doesn't support attaching files naively, so pastebin services could be used/suggested for that purpose
Pastebin services work too, though there's usually less control over their long-time availability, and using them also means scattering information-that-belongs-together across services. A GitHub gist might be a good choice to "stay close", but in general I prefer Attaching files directly to the issue to preserve the information long-term in a single place. (And then take an offline backup with something like git-bug or similar, but I digress).
tcsh
(t)csh is (and always was IIRC) the default shell in FreeBSD so the Rex
dfcommand won't work on any version/flavor of FreeBSD as well because of that
Yeah, I don't think I've encountered tcsh before elsewhere :) There's quite a few FreeBSD users and testers of Rex, so I'm curious why this issue didn't surface earlier, though :thinking:
We can think about introducing more test coverage perhaps (at least via CPAN testers, there's no native BSD option for GitHub Actions as far as I know, but there might some workarounds for that).
Details
I completely agree with all your points, and I appreciate your very much detailed response, prompt reaction, and the label being added
I'm glad you appreciate the detail. I tend to dump my thoughts as I go as reference for future readers (often myself :rofl:), and that often ends up highly detailed. I guess it's a case of "I didn't have time to write a shorter answer" :upside_down_face:
Redirect with tcsh
Unfortunately tcsh doesn't support
2>operator, and that's how standard error can be redirected1 in tcsh:$ (ls -l /bin/bash /bing/other > out) > & err (direct stdout to 'out' and stderr to 'err')
- https://code-maven.com/slides/linux/redirection-tcsh leftwards_arrow_with_hook
Yeah, that's the info I found too; thanks for confirming it and for the tips, and nice to see a code-maven link about it :+1:
Big picture
For the big picture I have these in mind currently:
- Since other parts of rex might be affected by the same issue, I'd like to go over these cases from the "redirect on tcsh" perspective, and see if there's a need and chance for a general fix.
- To avoid introducing potential regressions, I'd also check git history of
df()about what situations made previous authors to end up with this exact approach here. - I wonder if there's a good way to test the related behavior somehow, especially without having to actually run something on (Free)BSD, and prevent similar issues entering the code :thinking:
It might take some time until someone goes through all that, though.
Small picture :)
For this specific code part, I'd probably go for just dropping the STDERR handling, since that is ignored in the context of df() anyway. That is, no redirect, and drop the $err part completely if possible.
If df on tcsh is a bottleneck for your use case, it could be relatively easy to create an intermediate patch to apply to this limited scope (or perhaps even merge it, depending how hacky the approach is).