BACTpipe icon indicating copy to clipboard operation
BACTpipe copied to clipboard

Warn user if signalp is missing if prokka_gram is true

Open boulund opened this issue 5 years ago • 9 comments

I would like BACTpipe to detect if signalp is missing if prokka_gram is set to true, and warn the user that prokka requires signalp if combined with the prokka_gram option and perhaps instruct the user to run with --prokka_gram false if they want to proceed.

Now that I'm thinking of it, is prokka_gram really the best name for that config setting? Shouldn't it be --prokka_signal_peptides or --use-signalp instead, to make it more obvious to the user what it will affect?

What are your opinions @thorellk @abhi18av @emilio-r ?

boulund avatar Feb 01 '21 11:02 boulund

Yeah, I always thought it's not a nice name since it doesn't really convey anything to the end user. I think, prokka_signal_peptides makes more sense to end users 👍

abhi18av avatar Feb 01 '21 11:02 abhi18av

I would like BACTpipe to detect if signalp is missing ...

Hi team,

I'd like to confirm if there is a reliable cross-platform way we can check whether this signalp dependency exists in the env or not?

By default, I'm inclined to go with which signalp and then check within bash if this is null, but I'm not too confident with my bash-fu and there might be something more elegant :)

abhi18av avatar Feb 21 '21 17:02 abhi18av

A quick google led me here: https://stackoverflow.com/questions/592620/how-can-i-check-if-a-program-exists-from-a-bash-script

Sounds like just trying to run signalp would work well in our case as well, right?

I've seen variants of hash used before, but it's mentioned here as being a bash built-in, so maybe not as reliable.

boulund avatar Feb 22 '21 12:02 boulund

Okay, I'm just wondering whether this would be available in bioconda containers - which might be trimmed down in terms of packaged tools.

Let me check and circle back.

abhi18av avatar Feb 22 '21 15:02 abhi18av

@abhi18av Did you ever have time to check this? We can push this feature to the next release, so it's not a big deal right now.

boulund avatar May 05 '21 06:05 boulund

@boulund , not yet unfortunately 😞

I agree this could be added as a patch later.

But I did take a look yesterday and it seems that the which command is available within the bioconda containers, hash isn't.

Not sure how exactly to implement this feature. Perhaps a groovy function or NF utility process to check the presence of signalp, what do you suggest?

abhi18av avatar May 05 '21 11:05 abhi18av

I'm also not entirely sure what the best way to implement this is...

Maybe it's easiest to just do it in the bash code within the prokka process itself? E.g. in https://github.com/ctmrbio/BACTpipe/blob/master/modules/prokka/prokka.nf#L50-L65

Maybe something like this could work?

    # Fill in --gram argument if signalp is available
    if ${params.prokka_signal_peptides} && command -v signalp &> /dev/null; then
        prokka_gramstain_argument="${prokka_gramstain_argument}"
    else
        echo "WARNING: Signal peptides requested but signalp is not available, not running with --gram"
        prokka_gramstain_argument=""
    fi

    prokka \
        --cpus ${task.cpus} \
        --force \
        --evalue ${params.prokka_evalue} \
        --kingdom ${params.prokka_kingdom} \
        --locustag ${pair_id} \
        --outdir ${pair_id}_prokka \
        --prefix ${pair_id} \
        --strain ${pair_id} \
        ${prokka_reference_argument} \
        \${prokka_gramstain_argument} \
        ${prokka_genus_argument} \
        ${prokka_species_argument} \
        ${contigs_file}
    """

It won't produce a very visible warning message, however. It will only be shown in the stdout of the prokka process, if one knows that to look for.

Explanation of how I was thinking:

    if ${params.prokka_signal_peptides} && command -v signalp &> /dev/null; then
        prokka_gramstain_argument="${prokka_gramstain_argument}"

The value of params.prokka_signal_peptides will be either true or false, both are valid bash commands that have return codes corresponding to their name, so the first part of the if statement will evaluate to true if the user has requested the use of signalp in prokka. The second part will only run if the first part is true, and that will try to see if bash can find the command in $PATH and will evaluate to true if it can. Then it will set the bash variable prokka_gramstain_argument to the value from the variable with the same name in the Nextflow process namespace.

    else
        echo "WARNING: Signal peptides requested but signalp is not available, not running with --gram"
        prokka_gramstain_argument=""
    fi

If the statement above evaluates to false, this part will run and set the prokka_gramstain_argument to the empty string, meaning nothing related to --gram will be included in the argument list to prokka at all.

boulund avatar May 05 '21 11:05 boulund

Looking at it again I don't think this is the best idea. It might be better as you suggested to write a small Nextflow/groovy function to check for the presence of signalp before entering the bash section of the process definition. That would make it a lot easier to issue meaningful warning messages to the user as well. It's important that the function can check in the correct context (conda or docker etc). This must be a common problem, surely someone else in the Nextflow community must have solved something similar?

boulund avatar May 05 '21 11:05 boulund

This must be a common problem, surely someone else in the Nextflow community must have solved something similar?

Hmm, that's an interesting point. I'm not particularly sure about that 🤔

However, I'll ask around and try to come up with solution with the utility process in the meatime.

abhi18av avatar May 07 '21 17:05 abhi18av