gzip: stdin: No data available

Open eeaunin opened this issue 1 year ago • 4 comments

Hello. Below is a log from an FCS-GX run that crashed with the message gzip: stdin: No data available. What has happened here, and how can this problem be prevented?

=============================================================================== 
Source:      /mft-volume 
Destination: /app/db/gxdb 
Resuming failed transfer in /app/db/gxdb... 
Space check: Available:1.14TiB; Existing:0B; Incoming:464.34GiB; Delta:464.34GiB

Requires transfer: 59B all.meta.jsonl 
Copying /mft-volume/all.meta.jsonl to /app/db/gxdb/all.meta.jsonl.part... 

Requires transfer: 187B all.README.txt 
Copying /mft-volume/all.README.txt to /app/db/gxdb/all.README.txt.part... 

Requires transfer: 6.09MiB all.taxa.tsv 
Copying /mft-volume/all.taxa.tsv to /app/db/gxdb/all.taxa.tsv.part... 

Requires transfer: 7.86MiB all.blast_div.tsv.gz 
Copying /mft-volume/all.blast_div.tsv.gz to /app/db/gxdb/all.blast_div.tsv.gz.part... 

Requires transfer: 8.48MiB all.assemblies.tsv 
Copying /mft-volume/all.assemblies.tsv to /app/db/gxdb/all.assemblies.tsv.part... 

Requires transfer: 21.51MiB all.seq_info.tsv.gz 
Copying /mft-volume/all.seq_info.tsv.gz to /app/db/gxdb/all.seq_info.tsv.gz.part... 

Requires transfer: 165.14GiB all.gxs 
Copying /mft-volume/all.gxs to /app/db/gxdb/all.gxs.part... 

Requires transfer: 299.16GiB all.gxi 
Copying /mft-volume/all.gxi to /app/db/gxdb/all.gxi.part... 
Done. 
-----------------------------------------------------------------------------

tax-id    : 476027
fasta     : /sample-volume/assembly.fasta
size      : 2495.09 MiB
split-fa  : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gx_mapper_2955715/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^476027\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gx_mapper_2955715/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^476027\t']
BLAST-div : sponges
gx-div    : anml:basal metazoans
w/same-tax: True
bin-dir   : /app/bin
gx-db     : /app/db/gxdb/gx_mapper_2955715/all.gxi
gx-ver    : Nov 27 2023 11:05:36; git:v0.5.0+branch--HEAD
output    : /output-volume//assembly.476027.taxonomy.rpt

-----------------------------------------------------------------------------

####### args: Namespace(fasta='/sample-volume/assembly.fasta', tax_id=476027, species=None, split_fasta=True, div='anml:basal metazoans', gx_db='/app/db/gxdb/gx_mapper_2955715/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, ignore_same_kingdom=False, out_basename='/output-volume//assembly.476027', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, gzip_c='gzip -c', out_taxonomy_rpt='/output-volume//assembly.476027.taxonomy.rpt') 

####### Starting process ['cat', '/sample-volume/assembly.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=2616292917']
####### Starting process ['cat', '/sample-volume/assembly.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--output=/output-volume//assembly.476027.taxonomy.rpt.tmp', '--asserted-div=anml:basal metazoans', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
Using GX_PREFETCH=0
Collecting masking statistics...
Collected masking stats:  2.58323 Gbp; 30.8123s; 83.8376 Mbp/s. Baseline: 3.34072


gzip: stdin: No data available
####### Cleaning up process ['cat', '/sample-volume/assembly.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/assembly.fasta'])
####### Cleaning up process ['gzip', '-cdf']
Error: Process failed with retcode 1: ['gzip', '-cdf'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=2616292917']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--output=/output-volume//assembly.476027.taxonomy.rpt.tmp', '--asserted-div=anml:basal metazoans', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['cat', '/sample-volume/assembly.fasta']
####### Cleaning up process ['gzip', '-cdf']

-----------------------------------------------------------------------------

Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
    main()
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1012, in main
    run_gx_pipeline(args)
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 678, in run_gx_pipeline
    run(p_zcat_fasta, p_save_hits, p_main)
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 278, in __exit__
    self.wait()
  File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 268, in wait
    assert num_errors == 0, "Had errors."

These are the software versions used for this run:

* OS: Ubuntu 22.04.4 LTS
* Singularity: v3.11.4
* FCS image: 0.5.0
* Python: 3.8.12
* Platform: LSF
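For anyone reading the log above, the two failure lines can be decoded with the Python standard library. This is a minimal sketch, assuming a Linux host and the usual `subprocess` convention that a negative return code -N means the process was terminated by signal N. The helper name `describe_retcode` is made up for illustration and is not part of FCS-GX.

```python
import errno
import os
import signal


def describe_retcode(retcode):
    """Describe a return code as reported by Python's subprocess module."""
    if retcode < 0:
        # Negative values mean "terminated by signal -retcode".
        return f"killed by signal {signal.Signals(-retcode).name}"
    return f"exited with status {retcode}"


# The two return codes reported in the log.
print(describe_retcode(-13))  # the 'cat' process
print(describe_retcode(1))    # the 'gzip -cdf' process

# The errno whose message text matches the gzip error line (Linux).
print(errno.errorcode[errno.ENODATA], "->", os.strerror(errno.ENODATA))
```

Read this way, `cat` was killed by SIGPIPE, which is what happens when the process it is writing to has already exited, and the text printed by `gzip` matches the message for ENODATA. That suggests a read on gzip's standard input failed first and the `cat` failure is a consequence. The sketch does not say what caused the read to fail.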

eeaunin avatar May 09 '24 03:05 eeaunin

Following this as I'm having the same error.

LaurenHuet avatar May 10 '24 05:05 LaurenHuet

Expanding slightly on @eeaunin's comment above: this is currently a major problem for us at the Sanger. We started seeing it after moving to a new farm cluster. It didn't seem to occur at all on the old cluster, but on the new one it happens perhaps half the time. Rerunning the same jobs often works the second time, so it seems to be an (apparently) randomly occurring intermittent fault, and there doesn't seem to be anything unusual about the input FASTA files. It happens for small assemblies and large ones; sometimes FCS-GX runs for a while before this problem strikes.

jt8-sanger avatar May 10 '24 14:05 jt8-sanger

We are currently attempting to reproduce this issue locally, and would appreciate any information that you think could be pertinent.

* Does this behavior change depending on whether the input fasta is gzipped or not? Can you try multiple replicates to assess whether the outcome is intermittent or deterministic with respect to input format?
* Is it reproducible if the genome fasta file is initially copied to /tmp on the host, and then used as input?

@eeaunin do you have access to Docker, or just Singularity? If you do have both, can you test using Docker? @LaurenHuet @jt8-sanger - can you please provide more information about your run environment (OS, FCS image type, image version, job scheduler)?
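One way to run the replicate test suggested above is to launch the same command repeatedly and tally the outcomes. Below is a minimal Python sketch. The names `classify` and `run_replicates` and the placeholder command are hypothetical and not part of FCS-GX; the real command would be whatever invocation was used for the failing run.

```python
import subprocess
import sys
from collections import Counter

# Error text taken from the log in this issue.
ERROR_MARKER = "gzip: stdin: No data available"


def classify(returncode, output):
    """Label one run as ok, the error from this issue, or some other failure."""
    if returncode == 0:
        return "ok"
    if ERROR_MARKER in output:
        return "no_data_available"
    return "other_failure"


def run_replicates(cmd, n):
    """Run cmd n times, one after another, and count each outcome."""
    tally = Counter()
    for _ in range(n):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        tally[classify(proc.returncode, proc.stdout + proc.stderr)] += 1
    return tally


if __name__ == "__main__":
    # Placeholder command (an assumption): replace with the real FCS-GX invocation.
    placeholder_cmd = [sys.executable, "-c", "print('replace me')"]
    print(run_replicates(placeholder_cmd, 3))
```

Running the same tally once with a gzipped input and once with an uncompressed input would give directly comparable failure counts for the two formats.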

etvedte avatar May 10 '24 18:05 etvedte

Hello, thanks for replying. I am using FCS-GX with uncompressed (not gzipped) assembly FASTA files. The problem appears to be intermittent: a retry of a crashed run with the same input files and same settings can seemingly randomly succeed or fail again. I'll investigate this more with multiple replicates. I haven't tried copying the genome FASTA file to /tmp. I have so far only run FCS-GX with Singularity. I don't know if I have access to Docker on the LSF. I'll find out whether I have or not.

eeaunin avatar May 10 '24 21:05 eeaunin

I am using this with uncompressed FASTA files under Singularity. I have had the same error 3 times in a row with this FASTA file. I have pulled the latest Singularity container from the NCBI git. I am using SLURM on an HPC system (the Pawsey supercomputer). I ran this across a batch of 60 genomes; however, only one is receiving this error. It seems to run to about 99% and then fails with this error. I have checked the FASTA file for issues and it looks fine. It is a small genome.

`Prefetching /app/db/gxdb/gxdb/all.gxi 96%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 97%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 98%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 99%...                         
Prefetched /app/db/gxdb/gxdb/all.gxi in 1254.08s; 0.256136 GB/s. The file is 93% in RAM.
Collecting masking statistics...

gzip: stdin: No data available
Collected masking stats:  9.1081e-05 Gbp; 0.359757s; 0.253167 Mbp/s. Baseline: 1.02634


gzip: stdin: No data available
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG235.ilmn.230324.v129mh.fasta'])
Error: Process failed with retcode 1: ['gzip', '-cdf'])
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG235.ilmn.230324.v129mh.fasta'])
Error: Process failed with retcode 1: ['gzip', '-cdf'])

-----------------------------------------------------------------------------

Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
    main()
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1012, in main
    run_gx_pipeline(args)
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 678, in run_gx_pipeline
    run(p_zcat_fasta, p_save_hits, p_main)
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 278, in __exit__
    self.wait()
  File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 268, in wait
    assert num_errors == 0, "Had errors."
AssertionError: Had errors.
Traceback (most recent call last):
  File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
    sys.exit(main())
  File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
    gx.run()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
    self.args.func(self)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
    self.run_gx()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
    self.safe_exec(docker_args)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
    subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
  File "/software/projects/pawsey0812/singularity/miniconda3/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['singularity', 'exec', '--bind', '/scratch/pawsey0812/lhuet/NCBI:/app/db/gxdb/', '--bind', '/scratch/pawsey0812/lhuet/NOVA_230324_AD1/OG235/assemblies/genome:/sample-volume/', '--bind', '/scratch/pawsey0812/lhuet/NOVA_230324_AD1/OG235/assemblies/genome/NCBI:/output-volume/', '/software/projects/pawsey0812/singularity/fcs-gx.sif', 'python3', '/app/bin/run_gx', '--fasta', '/sample-volume/OG235.ilmn.230324.v129mh.fasta', '--out-dir', '/output-volume/', '--gx-db', '/app/db/gxdb/gxdb', '--tax-id', '7898']' returned non-zero exit status 1.`

LaurenHuet avatar May 13 '24 06:05 LaurenHuet

@etvedte: Regarding your question about my run environment: I'm working with @eeaunin on this, so the answers are the same as the ones he gives above.

jt8-sanger avatar May 13 '24 14:05 jt8-sanger

We noticed that the Apptainer (formerly Singularity) changelog mentions addressing "no data available" errors, which could be related to the issue you are observing.

https://github.com/apptainer/apptainer/blob/main/CHANGELOG.md#other-changes

Would it be possible for you to attempt to reproduce the issue using the latest version of Apptainer?

etvedte avatar May 13 '24 15:05 etvedte

I have now tried multiple replicates. I used a Plasmodium chabaudi chabaudi assembly FASTA file as the input. The Singularity version that I was using was singularity-ce 3.11.4. Yesterday I did 10 runs with the same assembly FASTA file and same settings and all 10 completed successfully. Today I did another 10 runs with the same files and settings. 3 out of 10 crashed with the gzip: stdin: No data available error.

eeaunin avatar May 14 '24 01:05 eeaunin

Here are some things in response to the questions from a few days ago:

> do you have access to Docker, or just Singularity? If you do have both, can you test using Docker?

I don't have proper access to Docker on the compute farm that I am using. There is a limited installation of Docker that doesn't allow writing results to disk. For production purposes I have to use Singularity.

> Is it reproducible if the genome fasta file is initially copied to /tmp on the host, and then used as input?

I have now tested running FCS-GX with and without copying the assembly FASTA file to /tmp, but for some reason the same error hasn't reappeared in the past 4 days. I ran FCS-GX with the same Plasmodium chabaudi chabaudi assembly file that I mentioned before: 80 runs with the assembly FASTA copied to /tmp before the run and 80 runs without copying it to /tmp. There were no crashes in either set of runs. I still have no idea what determines whether the crashes with the gzip: stdin: No data available error happen or not.

> Would it be possible for you to attempt to reproduce the issue using the latest version of Apptainer?

The installation of Apptainer has been requested from the IT service desk but they haven't installed it yet
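For the /tmp test, the staging step can be scripted so every replicate does it the same way. A minimal Python sketch, assuming the goal is simply a fresh local copy of the input with a basic size check; `stage_fasta` is a hypothetical helper name and the example path in the usage note is a placeholder.

```python
import shutil
import tempfile
from pathlib import Path


def stage_fasta(src, local_root=None):
    """Copy src into a new directory under local_root and return the new path.

    local_root=None uses the system temporary directory (usually /tmp on Linux).
    """
    src = Path(src)
    stage_dir = Path(tempfile.mkdtemp(prefix="fcs_stage_", dir=local_root))
    dst = stage_dir / src.name
    shutil.copy2(src, dst)
    # Basic sanity check that the copy is complete.
    if dst.stat().st_size != src.stat().st_size:
        raise OSError(f"size mismatch after copying {src} to {dst}")
    return dst
```

A call such as `stage_fasta("/path/to/assembly.fasta")` returns the staged path, and the directory containing it is what would then be bound to /sample-volume/ for the run.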

eeaunin avatar May 19 '24 07:05 eeaunin

That's good to hear.

We are also working on a new patch release that may or may not help with this issue. I'll keep you posted when that's available.

etvedte avatar May 20 '24 16:05 etvedte

Hello, I have version 4.1.0 of Singularity and have pulled the latest version of FCS-GX from the git page. I am using the Slurm job scheduler on Pawsey.

I have run this 6 times across 35 genome assemblies. 9 of them have completed successfully; the rest continually error out with gzip: stdin: No data available.

When I first posted about this error, I was running it across 24 (different) genome assemblies with only 2 receiving this error.

I have seen that I am getting more errors with Illumina data than with PacBio data.

`-----------------------------------------------------------------------------

tax-id    : 7898
fasta     : /sample-volume/OG193.ilmn.240313.v129mh.fasta
size      : 795.28 MiB
split-fa  : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^7898\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^7898\t']
BLAST-div : bony fishes
gx-div    : anml:fishes
w/same-tax: True
bin-dir   : /app/bin
gx-db     : /app/db/gxdb/gxdb/all.gxi
gx-ver    : Nov 27 2023 11:05:36; git:v0.5.0+branch--HEAD
output    : /output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt


####### args: Namespace(fasta='/sample-volume/OG193.ilmn.240313.v129mh.fasta', tax_id=7898, species=None, split_fasta=True, div='anml:fishes', gx_db='/app/db/gxdb/gxdb/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, ignore_same_kingdom=False, out_basename='/output-volume//OG193.ilmn.240313.v129mh.7898', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, gzip_c='gzip -c', out_taxonomy_rpt='/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt')

####### Starting process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=833913207']
####### Starting process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
Collecting masking statistics...
Collected masking stats: 0.825914 Gbp; 9.98127s; 82.7463 Mbp/s. Baseline: 1.77974

gzip: stdin: No data available
####### Cleaning up process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta'])
####### Cleaning up process ['gzip', '-cdf']
Error: Process failed with retcode 1: ['gzip', '-cdf'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=833913207']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Cleaning up process ['gzip', '-cdf']


Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
    main()
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1012, in main
    run_gx_pipeline(args)
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 678, in run_gx_pipeline
    run(p_zcat_fasta, p_save_hits, p_main)
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 278, in __exit__
    self.wait()
  File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 268, in wait
    assert num_errors == 0, "Had errors."
AssertionError: Had errors.
Traceback (most recent call last):
  File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
    sys.exit(main())
             ^^^^^^
  File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
    gx.run()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
    self.args.func(self)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
    self.run_gx()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
    self.safe_exec(docker_args)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
    subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
  File "/software/setonix/2024.05/software/linux-sles15-zen3/gcc-12.2.0/python-3.11.6-4ysxrvuaor6iljintmzcazlkfcokwnes/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '`

LaurenHuet avatar Jun 10 '24 01:06 LaurenHuet

We have a new FCS v0.5.4 release that may resolve this gzip: stdin: No data available issue. Can you update the version you are using and re-test?

etvedte avatar Jun 26 '24 14:06 etvedte

Hello, thank you. I have tested FCS v0.5.4 across 26 of the same genomes I was using in my previous comment.

(I have version 4.1.0 of Singularity and have pulled the latest version of FCS-GX from the git page. I am using the Slurm job scheduler on Pawsey.)

On the first run, 19 passed without error, 2 failed due to the time limit, and 5 failed with the following error (the same error in all 5):

`-----------------------------------------------------------------------------

tax-id    : 7898
fasta     : /sample-volume/OG82.ilmn.240313.v129mh.fasta
size      : 731.01 MiB
split-fa  : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^7898\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^7898\t']
BLAST-div : bony fishes
gx-div    : anml:fishes
w/same-tax: True
bin-dir   : /app/bin
gx-db     : /app/db/gxdb/gxdb/all.gxi
gx-ver    : Jun 18 2024 11:01:15; git:v0.5.4-8-g3c7c426
output    : /output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt

-----------------------------------------------------------------------------

####### args: Namespace(fasta='/sample-volume/OG82.ilmn.240313.v129mh.fasta', tax_id=7898, species=None, split_fasta=True, div='anml:fishes', gx_db='/app/db/gxdb/gxdb/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, out_basename='/output-volume//OG82.ilmn.240313.v129mh.7898', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, ignore_same_kingdom=None, out_taxonomy_rpt='/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt') 

####### 
CPU count:  256
####### 
/proc/meminfo:
MemTotal:       1056103484 kB
MemFree:        489411400 kB
MemAvailable:   979519848 kB
Buffers:            8480 kB
Cached:         510103948 kB
SwapCached:            0 kB
Active:         484861268 kB
Inactive:       25463892 kB
Active(anon):     487404 kB
Inactive(anon):   690676 kB
Active(file):   484373864 kB
Inactive(file): 24773216 kB
Unevictable:        8928 kB
Mlocked:              80 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 8 kB
Writeback:             0 kB
AnonPages:        220456 kB
Mapped:           138828 kB
Shmem:            970968 kB
KReclaimable:    2611608 kB
Slab:           47955104 kB
SReclaimable:    2611608 kB
SUnreclaim:     45343496 kB
KernelStack:       45680 kB
PageTables:        10072 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    528051740 kB
Committed_AS:    1767820 kB
VmallocTotal:   34359738367 kB
VmallocUsed:     1508708 kB
VmallocChunk:          0 kB
Percpu:           525312 kB
HardwareCorrupted:     0 kB
AnonHugePages:     61440 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:     1874536 kB
DirectMap2M:    28387328 kB
DirectMap1G:    1043333120 kB

####### 
top:
Mem: 566692784K used, 489410700K free, 970968K shrd, 8480K buff, 510109012K cached
CPU:  0.0% usr  0.0% sys  0.0% nic 99.9% idle  0.0% io  0.0% irq  0.0% sirq
Load average: 23.19 32.03 30.76 1/2600 93160
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
93158 93157 tpeirce  R     1524  0.0  25  0.0 top -b
117702     1 root     S    3398m  0.1 144  0.0 /usr/sbin/slurmd -D -s

####### Starting process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
####### Starting process ['/app/bin/gx', 'get-fasta-stats']
####### Starting process ['tee', '/dev/fd/6']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=766520091', '--buffer-size=104857600']
####### Starting process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3']
####### Starting process ['pv', '--quiet', '--buffer-size=104857600']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
Collecting masking statistics...
Collected masking stats:  0.747621 Gbp; 10.1589s; 73.5923 Mbp/s. Baseline: 1.73023

####### Cleaning up process ['tee', '/dev/fd/6']
Error: Process failed with retcode 1: ['tee', '/dev/fd/6'])
####### Cleaning up process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=766520091', '--buffer-size=104857600']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3']
####### Cleaning up process ['pv', '--quiet', '--buffer-size=104857600']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['/app/bin/gx', 'get-fasta-stats']
####### Cleaning up process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']

-----------------------------------------------------------------------------

Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 1091, in <module>
    main()
  File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 1066, in main
    run_gx_pipeline(args)
  File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 709, in run_gx_pipeline
    with ProcessPipeline() as p_main:
  File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 289, in __exit__
    self.wait()
  File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 279, in wait
    assert num_errors == 0, "Had errors."
           ^^^^^^^^^^^^^^^
AssertionError: Had errors.
Traceback (most recent call last):
  File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
    sys.exit(main())
             ^^^^^^
  File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
    gx.run()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
    self.args.func(self)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
    self.run_gx()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
    self.safe_exec(docker_args)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
    subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
  File "/software/setonix/2024.05/software/linux-sles15-zen3/gcc-12.2.0/python-3.11.6-4ysxrvuaor6iljintmzcazlkfcokwnes/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['singularity', 'exec', '--bind', '/scratch/pawsey0812/lhuet/NCBI:/app/db/gxdb/', '--bind', '/scratch/pawsey0812/tpeirce/DRAFTGENOME/OUTPUT/OG82/assemblies/genome:/sample-volume/', '--bind', '/scratch/pawsey0812/tpeirce/DRAFTGENOME/OUTPUT/OG82/assemblies/genome/NCBI:/output-volume/', '/software/projects/pawsey0812/singularity/fcs-gx.sif', 'python3', '/app/bin/run_gx', '--fasta', '/sample-volume/OG82.ilmn.240313.v129mh.fasta', '--out-dir', '/output-volume/', '--gx-db', '/app/db/gxdb/gxdb', '--tax-id', '7898', '--debug']' returned non-zero exit status 1.`

On the second run, 3 of the 5 that had the error passed and 2 failed with the following error. (2 failed again due to the time limit.)

`-----------------------------------------------------------------------------

tax-id    : 7898
fasta     : /sample-volume/OG82.ilmn.240313.v129mh.fasta
size      : 731.01 MiB
split-fa  : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^7898\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^7898\t']
BLAST-div : bony fishes
gx-div    : anml:fishes
w/same-tax: True
bin-dir   : /app/bin
gx-db     : /app/db/gxdb/gxdb/all.gxi
gx-ver    : Jun 18 2024 11:01:15; git:v0.5.4-8-g3c7c426
output    : /output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt

-----------------------------------------------------------------------------

####### args: Namespace(fasta='/sample-volume/OG82.ilmn.240313.v129mh.fasta', tax_id=7898, species=None, split_fasta=True, div='anml:fishes', gx_db='/app/db/gxdb/gxdb/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, out_basename='/output-volume//OG82.ilmn.240313.v129mh.7898', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, ignore_same_kingdom=None, out_taxonomy_rpt='/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt') 

####### 
CPU count:  256
####### 
/proc/meminfo:
MemTotal:       1056103484 kB
MemFree:        552111716 kB
MemAvailable:   831867096 kB
Buffers:            7268 kB
Cached:         299756428 kB
SwapCached:            0 kB
Active:         50246196 kB
Inactive:       418118744 kB
Active(anon):     678280 kB
Inactive(anon): 168927556 kB
Active(file):   49567916 kB
Inactive(file): 249191188 kB
Unevictable:        8928 kB
Mlocked:              80 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 4 kB
Writeback:             0 kB
AnonPages:      168610312 kB
Mapped:           164812 kB
Shmem:           1008952 kB
KReclaimable:    2681424 kB
Slab:           30897148 kB
SReclaimable:    2681424 kB
SUnreclaim:     28215724 kB
KernelStack:       47376 kB
PageTables:       479420 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    528051740 kB
Committed_AS:   171739260 kB
VmallocTotal:   34359738367 kB
VmallocUsed:     1508724 kB
VmallocChunk:          0 kB
Percpu:           470016 kB
HardwareCorrupted:     0 kB
AnonHugePages:  127375360 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:     2312808 kB
DirectMap2M:    79329280 kB
DirectMap1G:    991952896 kB

####### 
top:
Mem: 503990456K used, 552113028K free, 1008952K shrd, 7268K buff, 299760140K cached
CPU:  6.3% usr  0.0% sys  0.0% nic 93.6% idle  0.0% io  0.0% irq  0.0% sirq
Load average: 17.34 17.82 17.73 17/2709 165178
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
229719229695 22120    R    18.7g  0.9  10  0.3 /software/projects/pawsey0265/jho/setonix/orca_5_0_3_linux_x86-64_shared_openmpi411/orca_mp2_mpi FC9_CUTALL.r2scan.rev-DSD-PBEP86-D4.mp2inp.tmp FC9_CUTALL.r2scan.rev-DSD-PBEP86-D4
229720229695 22120    R    18.6g  0.9  24  0.3 /software/projects/pawsey0265/jho/setonix/orca_5_0_3_linux_x86-64_shared_openmpi411/orca_mp2_mpi FC9_CUTALL.r2scan.rev-DSD-PBEP86-D4.mp2inp.tmp FC9_CUTALL.r2scan.rev-DSD-PBEP86-D4

####### Starting process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
####### Starting process ['/app/bin/gx', 'get-fasta-stats']
####### Starting process ['tee', '/dev/fd/6']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=766520091', '--buffer-size=104857600']
####### Starting process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3']
####### Starting process ['pv', '--quiet', '--buffer-size=104857600']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']

    GX requires the database to be entirely in RAM to avoid thrashing.
    Consider placing the database files in a non-swappable tmpfs or ramfs.
    See https://github.com/ncbi/fcs/wiki/FCS-GX for details.
    Will prefetch (vmtouch) the database pages to have the OS cache them in main memory.
Prefetching /app/db/gxdb/gxdb/all.gxi 93%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 94%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 95%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 96%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 97%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 98%...                         
Prefetching /app/db/gxdb/gxdb/all.gxi 99%...                         
Prefetched /app/db/gxdb/gxdb/all.gxi in 4411.67s; 0.0728106 GB/s. The file is 0% in RAM.
Collecting masking statistics...
Fatal error: Stream  terminated abnormally.
Command exited with non-zero status 1
	Command being timed: "nice -n19 /app/bin/gx align --gx-db=/app/db/gxdb/gxdb/all.gxi --repeats-basis-fa=/dev/fd/3"
	User time (seconds): 48.23
	System time (seconds): 1485.22
	Percent of CPU this job got: 22%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 1h 52m 01s
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 887733552
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 117765165
	Minor (reclaiming a frame) page faults: 3998362
	Voluntary context switches: 143447
	Involuntary context switches: 2400
	Swaps: 0
	File system inputs: 942766856
	File system outputs: 1648
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 1
Warning: missing header '##[["GX hits",2,1]]'
Fatal error: taxify.cpp:327 in make_run_info_json(...): Assertion failed: agg_cvg <= 1
####### Cleaning up process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta'])
####### Cleaning up process ['tee', '/dev/fd/6']
Error: Process failed with retcode -13: ['tee', '/dev/fd/6'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
Error: Process failed with retcode -13: ['/app/bin/gx', 'split-fasta'])
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=766520091', '--buffer-size=104857600']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3']
Error: Process failed with retcode 1: ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3'])
####### Cleaning up process ['pv', '--quiet', '--buffer-size=104857600']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['/app/bin/gx', 'get-fasta-stats']
####### Cleaning up process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']

-----------------------------------------------------------------------------

Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 1091, in <module>
    main()
  File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 1066, in main
    run_gx_pipeline(args)
  File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 709, in run_gx_pipeline
    with ProcessPipeline() as p_main:
  File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 289, in __exit__
    self.wait()
  File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 279, in wait
    assert num_errors == 0, "Had errors."
           ^^^^^^^^^^^^^^^
AssertionError: Had errors.
Traceback (most recent call last):
  File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
    sys.exit(main())
             ^^^^^^
  File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
    gx.run()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
    self.args.func(self)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
    self.run_gx()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
    self.safe_exec(docker_args)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
    subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
  File "/software/setonix/2024.05/software/linux-sles15-zen3/gcc-12.2.0/python-3.11.6-4ysxrvuaor6iljintmzcazlkfcokwnes/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['singularity', 'exec', '--bind', '/scratch/pawsey0812/lhuet/NCBI:/app/db/gxdb/', '--bind', '/scratch/pawsey0812/tpeirce/DRAFTGENOME/OUTPUT/troubleshoot/OG82/assemblies/genome:/sample-volume/', '--bind', '/scratch/pawsey0812/tpeirce/DRAFTGENOME/OUTPUT/troubleshoot/OG82/assemblies/genome/NCBI:/output-volume/', '/software/projects/pawsey0812/singularity/fcs-gx.sif', 'python3', '/app/bin/run_gx', '--fasta', '/sample-volume/OG82.ilmn.240313.v129mh.fasta', '--out-dir', '/output-volume/', '--gx-db', '/app/db/gxdb/gxdb', '--tax-id', '7898', '--debug']' returned non-zero exit status 1.
`
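
The log warns that GX needs the database entirely in RAM and then reports the file as 0% in RAM after prefetching. A rough pre-flight check can compare the database size with what the kernel reports as available before a run is launched. This is only a sketch under assumptions: the helper names and the 10% headroom factor are illustrative and not part of FCS-GX, and the path is simply the one shown in the log.

```python
import os
import re


def mem_available_kib(meminfo_text):
    """Return the MemAvailable value (in kB) from /proc/meminfo-style text, or None."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def db_fits_in_ram(db_bytes, meminfo_text, headroom=1.1):
    """True if db_bytes times a safety factor fits within MemAvailable.

    headroom is an assumed margin, not a value documented by FCS-GX.
    """
    available = mem_available_kib(meminfo_text)
    if available is None:
        raise ValueError("MemAvailable not found in meminfo text")
    return db_bytes * headroom <= available * 1024


if __name__ == "__main__":
    db_path = "/app/db/gxdb/gxdb/all.gxi"  # path taken from the log above
    if os.path.exists(db_path) and os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as fh:
            print(db_fits_in_ram(os.path.getsize(db_path), fh.read()))
```

On a shared node the answer can change while the job runs, since other jobs compete for the same memory, so a passing check is a hint rather than a guarantee.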

Now, when I try to re-run them, they error out with the following:

Traceback (most recent call last):
  File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
    sys.exit(main())
  File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
    gx.run()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
    self.args.func(self)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
    self.run_gx()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
    self.safe_exec(docker_args)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
    subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
  File "/usr/lib64/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'text'

I'm planning to keep trying to run them in the coming week.

LaurenHuet, Jul 05 '24 07:07

This error now appears every time I run any of the fcs commands (for cleaning the genome, and even when trying to run the fcs adaptor):

Traceback (most recent call last):
  File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
    sys.exit(main())
  File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
    gx.run()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
    self.args.func(self)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
    self.run_gx()
  File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
    self.safe_exec(docker_args)
  File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
    subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
  File "/usr/lib64/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'text'

LaurenHuet, Jul 09 '24 01:07

@LaurenHuet can you verify your Python version? FCS requires 3.7+. I see some of your older logs show python-3.11.6, but did your configuration change since then?

Can you also verify you are using the latest version of the fcs.py runner? You can check with: grep DEFAULT_VERSION fcs.py
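
For context, the `text` keyword of `subprocess.run` was added in Python 3.7, which is why the traceback from `/usr/lib64/python3.6/subprocess.py` rejects it while the earlier python-3.11.6 runs did not. A small sketch for checking which interpreter is actually being picked up follows; the helper names `supports_text_kwarg` and `run_text` are made up for illustration and are not part of fcs.py.

```python
import subprocess
import sys


def supports_text_kwarg(version_info=sys.version_info):
    """subprocess.run() accepts text= only on Python 3.7 and later."""
    return tuple(version_info[:2]) >= (3, 7)


def run_text(args):
    """Run a command and capture decoded stdout on old and new interpreters alike."""
    # universal_newlines is the older spelling of text= and also works on Python 3.6.
    return subprocess.run(
        args, check=True, stdout=subprocess.PIPE, universal_newlines=True
    )


if __name__ == "__main__":
    # Show which interpreter is running, since the system default may differ
    # from the one loaded through a module or environment.
    print(sys.executable, sys.version.split()[0])
    if not supports_text_kwarg():
        print("This interpreter is older than 3.7; subprocess.run(text=True) will fail.")
```

Running this with the same `python3` that launches fcs.py shows whether the environment has fallen back to the system Python 3.6.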

etvedte, Jul 10 '24 14:07

Hey, it's all working now, thank you for your help. It was an issue with Python switching versions on my system!

LaurenHuet, Jul 16 '24 01:07

Good to hear! Since there don't appear to be any open issues left in this thread, I'm going to close it. If anything else comes up, feel free to open a new GitHub issue.

etvedte, Jul 16 '24 12:07