Error running mpi job with plass

Open liuxianghui opened this issue 5 years ago • 1 comments

Dear sir: I have no problem running a single plass job on my Linux cluster. However, I would like to try an MPI plass job. The job script and the log file are given below. I request two nodes with 32 cores each via PBS -l nodes=2:ppn=32, and in the job I call mpirun -np 64. The run fails with errors writing to /tmp. The same problem occurs if I specify a local tmp folder instead. Please kindly advise.

The sh file is like this:

#!/bin/bash
#PBS -N mpi
# name you want to give your job
# the default output file will use this
#PBS -q std
# specify the queue you want to use
#PBS -l nodes=2:ppn=32
#PBS -j oe
#PBS -o logs
# =======================================================
# LOAD PBS MODULES
# =======================================================
cd $PBS_O_WORKDIR

module load openmpi3/gcc/64/3.1.4
module load pbs

#cd $PBS_O_WORKDIR

##export OMP_NUM_THREADS=1

#mpirun -np 42 plass assemble examples/reads_1.fastq.gz examples/reads_2.fastq.gz assembly.fas tmp

#plass assemble examples/reads_1.fastq.gz examples/reads_2.fastq.gz assembly.fas tmp
for bin in $(cat list2)
do
    echo ${bin}
    mpirun -np 64 plass assemble $bin'_tr_hostout_R1.fastq' $bin'_tr_hostout_R2.fastq' $bin'.assembly.plass.protein.fas' /tmp
    #plass nuclassemble $bin'_tr_hostout_R1.fastq' $bin'_tr_hostout_R2.fastq' $bin'.assembly.plass.nuc.fas' tmp
    echo "###Assembly Sample" $bin" Start###"
    date
done

list2 file: P016_S9

The log file looks like this:

P016_S9

assemble P016_S9_tr_hostout_R1.fastq P016_S9_tr_hostout_R2.fastq P016_S9.assembly.plass.protein.fas /tmp

assemble P016_S9_tr_hostout_R1.fastq P016_S9_tr_hostout_R2.fastq P016_S9.assembly.plass.protein.fas /tmp

assemble P016_S9_tr_hostout_R1.fastq P016_S9_tr_hostout_R2.fastq P016_S9.assembly.plass.protein.fas /tmp

assemble P016_S9_tr_hostout_R1.fastq P016_S9_tr_hostout_R2.fastq P016_S9.assembly.plass.protein.fas /tmp

assemble P016_S9_tr_hostout_R1.fastq P016_S9_tr_hostout_R2.fastq P016_S9.assembly.plass.protein.fas /tmp

assemble P016_S9_tr_hostout_R1.fastq P016_S9_tr_hostout_R2.fastq P016_S9.assembly.plass.protein.fas /tmp

MMseqs Version: 3.764a3
Substitution matrix nucl:nucleotide.out,aa:blosum62.out
Rescore mode 3
Allow wrapped scoring false
Remove hits by seq. id. and coverage false
E-value threshold 1e-05
Coverage threshold 0
Add backtrace
MMseqs Version: 3.764a3
Substitution matrix nucl:nucleotide.out,aa:blosum62.out
Rescore mode 3
Allow wrapped scoring false
Remove hits by seq. id. and coverage false
E-value threshold 1e-05
Coverage threshold 0
Add backtrace false
Coverage mode 0
Seq. id. threshold 0.9
Min. alignment length 0
Seq. id. mode 0
Include identical seq. id. false
Sort results 0
Preload mode 0
Threads 64
Compressed 0
Verbosity 3
Alphabet size 13
K-mers per sequence 60
scale k-mers per sequence 0
Adjust k-mer length false
Mask residues 0
Mask lower case residues 0
K-mer size 14
Max sequence length 65535
Shift hash 5
Split memory limit 0
Include only extendable true
Skip repeating k-mers true
Min codons in orf 45
Max codons in length 32734
Max orf gaps 2147483647
Could not delete /tmp/latest!
Could not create symlink of /tmp/531455983002076514!
Could not delete /tmp/latest!
Could not delete /tmp/latest!
Could not delete /tmp/latest!
Could not write file /tmp/531455983002076514/assembler.sh!
Could not delete /tmp/latest!
Could not delete /tmp/latest!
Could not delete /tmp/latest!
Could not write file /tmp/531455983002076514/assembler.sh!
Could not delete /tmp/latest!
Could not delete /tmp/latest!
Could not delete /tmp/latest!
Could not delete /tmp/latest!
Could not delete /tmp/latest!

liuxianghui, Jun 10 '20 11:06

Thank you for reporting this issue. MPI support is currently broken in PLASS; please use the single (non-MPI) version in the meantime.

martin-steinegger, Jun 15 '20 09:06
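
For anyone landing here, a minimal sketch of the suggested fallback: the same per-sample loop as in the job script above, but calling plass directly on one node instead of through mpirun. The single-node resource request (nodes=1:ppn=32), the helper name run_one, and the TMP_ROOT and DRY_RUN variables are assumptions made for illustration; they are not from the report or from PLASS itself. Each sample gets its own tmp directory under TMP_ROOT, in line with the template's advice to use newly created, empty tmp folders. The module load lines from the original script are omitted and would need to be added back for the cluster in question.

```shell
#!/bin/bash
#PBS -N plass_single
#PBS -q std
#PBS -l nodes=1:ppn=32
#PBS -j oe
#PBS -o logs

# Work from the submission directory when running under PBS.
cd "${PBS_O_WORKDIR:-.}" || exit 1

# Hypothetical scratch root; one subdirectory per sample is created below it.
TMP_ROOT="${TMP_ROOT:-$PWD/plass_tmp}"

# Assemble one sample without mpirun (illustrative helper, not a PLASS feature).
# With DRY_RUN=1 the command is only printed, not executed.
run_one() {
    sample_id="$1"
    sample_tmp="${TMP_ROOT}/${sample_id}"
    set -- plass assemble \
        "${sample_id}_tr_hostout_R1.fastq" \
        "${sample_id}_tr_hostout_R2.fastq" \
        "${sample_id}.assembly.plass.protein.fas" \
        "$sample_tmp"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        mkdir -p "$sample_tmp" && "$@"
    fi
}

# Process every sample named in list2, if that file is present.
if [ -f list2 ]; then
    while IFS= read -r sample; do
        if [ -n "$sample" ]; then
            echo "### Assembly sample ${sample} start ###"
            date
            run_one "$sample"
        fi
    done < list2
fi
```

Setting DRY_RUN=1 before submitting prints the plass commands that would be run, which is a cheap way to check file names and tmp paths before spending cluster time.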