gzip: stdin: No data available
Hello. Below is a log from an FCS-GX run that crashed with the message gzip: stdin: No data available. What happened here, and how can this problem be prevented?
===============================================================================
Source: /mft-volume
Destination: /app/db/gxdb
Resuming failed transfer in /app/db/gxdb...
Space check: Available:1.14TiB; Existing:0B; Incoming:464.34GiB; Delta:464.34GiB
Requires transfer: 59B all.meta.jsonl
Copying /mft-volume/all.meta.jsonl to /app/db/gxdb/all.meta.jsonl.part...
Requires transfer: 187B all.README.txt
Copying /mft-volume/all.README.txt to /app/db/gxdb/all.README.txt.part...
Requires transfer: 6.09MiB all.taxa.tsv
Copying /mft-volume/all.taxa.tsv to /app/db/gxdb/all.taxa.tsv.part...
Requires transfer: 7.86MiB all.blast_div.tsv.gz
Copying /mft-volume/all.blast_div.tsv.gz to /app/db/gxdb/all.blast_div.tsv.gz.part...
Requires transfer: 8.48MiB all.assemblies.tsv
Copying /mft-volume/all.assemblies.tsv to /app/db/gxdb/all.assemblies.tsv.part...
Requires transfer: 21.51MiB all.seq_info.tsv.gz
Copying /mft-volume/all.seq_info.tsv.gz to /app/db/gxdb/all.seq_info.tsv.gz.part...
Requires transfer: 165.14GiB all.gxs
Copying /mft-volume/all.gxs to /app/db/gxdb/all.gxs.part...
Requires transfer: 299.16GiB all.gxi
Copying /mft-volume/all.gxi to /app/db/gxdb/all.gxi.part...
Done.
-----------------------------------------------------------------------------
tax-id : 476027
fasta : /sample-volume/assembly.fasta
size : 2495.09 MiB
split-fa : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gx_mapper_2955715/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^476027\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gx_mapper_2955715/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^476027\t']
BLAST-div : sponges
gx-div : anml:basal metazoans
w/same-tax: True
bin-dir : /app/bin
gx-db : /app/db/gxdb/gx_mapper_2955715/all.gxi
gx-ver : Nov 27 2023 11:05:36; git:v0.5.0+branch--HEAD
output : /output-volume//assembly.476027.taxonomy.rpt
-----------------------------------------------------------------------------
####### args: Namespace(fasta='/sample-volume/assembly.fasta', tax_id=476027, species=None, split_fasta=True, div='anml:basal metazoans', gx_db='/app/db/gxdb/gx_mapper_2955715/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, ignore_same_kingdom=False, out_basename='/output-volume//assembly.476027', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, gzip_c='gzip -c', out_taxonomy_rpt='/output-volume//assembly.476027.taxonomy.rpt')
####### Starting process ['cat', '/sample-volume/assembly.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=2616292917']
####### Starting process ['cat', '/sample-volume/assembly.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--output=/output-volume//assembly.476027.taxonomy.rpt.tmp', '--asserted-div=anml:basal metazoans', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
Using GX_PREFETCH=0
Collecting masking statistics...
Collected masking stats: 2.58323 Gbp; 30.8123s; 83.8376 Mbp/s. Baseline: 3.34072
gzip: stdin: No data available
####### Cleaning up process ['cat', '/sample-volume/assembly.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/assembly.fasta'])
####### Cleaning up process ['gzip', '-cdf']
Error: Process failed with retcode 1: ['gzip', '-cdf'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=2616292917']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gx_mapper_2955715/all.gxi', '--output=/output-volume//assembly.476027.taxonomy.rpt.tmp', '--asserted-div=anml:basal metazoans', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['cat', '/sample-volume/assembly.fasta']
####### Cleaning up process ['gzip', '-cdf']
-----------------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
main()
File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1012, in main
run_gx_pipeline(args)
File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 678, in run_gx_pipeline
run(p_zcat_fasta, p_save_hits, p_main)
File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 278, in __exit__
self.wait()
File "/tmp/Bazel.runfiles_rs4oazym/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 268, in wait
assert num_errors == 0, "Had errors."
These are the software versions used for this run:
* OS: Ubuntu 22.04.4 LTS
* Singularity: v3.11.4
* FCS image: 0.5.0
* Python: 3.8.12
* Platform: LSF
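A note for anyone reading the log above: the `retcode -13` reported for `cat` is consistent with the Python `subprocess` convention, where a negative return code means the child was terminated by that signal number, and signal 13 is SIGPIPE on Linux. That would suggest `cat` was stopped because the downstream `gzip -cdf` had already exited, although this is an interpretation and not something the log states. The `No data available` text matches the usual Linux message for the `ENODATA` errno. Below is a minimal sketch for decoding these values; the helper name `describe_returncode` is made up for illustration and is not part of FCS-GX.

```python
import errno
import os
import signal


def describe_returncode(retcode: int) -> str:
    """Translate a subprocess-style return code into a readable description."""
    if retcode < 0:
        # subprocess reports termination by a signal as the negated signal number.
        try:
            return f"killed by {signal.Signals(-retcode).name}"
        except ValueError:
            return f"killed by unknown signal {-retcode}"
    if retcode == 0:
        return "exited normally"
    return f"exited with status {retcode}"


if __name__ == "__main__":
    print(describe_returncode(-13))  # the failing `cat` in the log
    print(describe_returncode(1))    # the failing `gzip -cdf` in the log
    # The message text for ENODATA depends on the platform's C library.
    print(os.strerror(errno.ENODATA))
```

If this reading is right, the `gzip` failure would be the primary error and the `cat` failure a consequence of it.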
Following this as I'm having the same error.
Expanding slightly on @eeaunin's comment above: this is currently a major problem for us at the Sanger. We started seeing it after moving to a new farm cluster. It didn't seem to occur at all on the old cluster, but on the new one it happens perhaps half the time. Rerunning the same jobs often works the second time, so it appears to be an intermittent fault that occurs at random, and there doesn't seem to be anything unusual about the input FASTA files. It happens for small assemblies and large ones, and sometimes FCS-GX runs for a while before the problem strikes.
We are currently attempting to reproduce this issue locally, and would appreciate any information that you think could be pertinent.
* Does this behavior change depending on whether the input FASTA is gzipped? Can you try multiple replicates to assess whether the outcome is intermittent or deterministic with respect to input format?
* Is it reproducible if the genome fasta file is initially copied to /tmp on the host, and then used as input?
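The second test above could be scripted roughly as follows. This is only a sketch under assumptions: the function names `sha256_of` and `stage_to_local` are hypothetical, and the idea is simply to copy the FASTA to node-local storage and confirm the copy is byte-identical before pointing FCS-GX at it. Reading the whole file in Python also means that a read error on the shared filesystem would surface as an `OSError` at this stage instead of inside the pipeline.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Read the whole file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_to_local(source: Path, local_root: Path) -> Path:
    """Copy `source` under `local_root` and confirm the copy matches the original."""
    local_root.mkdir(parents=True, exist_ok=True)
    staged = local_root / source.name
    shutil.copyfile(source, staged)
    if sha256_of(source) != sha256_of(staged):
        raise OSError(f"staged copy of {source} does not match the original")
    return staged


if __name__ == "__main__":
    # Self-contained demo with a tiny stand-in FASTA; real paths would differ.
    work = Path(tempfile.mkdtemp())
    fasta = work / "assembly.fasta"
    fasta.write_text(">seq1\nACGT\n")
    print(stage_to_local(fasta, work / "local_copy"))
```

The staged path would then be bound into the container in place of the original location.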
@eeaunin do you have access to Docker, or just Singularity? If you do have both, can you test using Docker? @LaurenHuet @jt8-sanger - can you please provide more information about your run environment (OS, FCS image type, image version, job scheduler)?
Hello, thanks for replying. I am using FCS-GX with uncompressed (not gzipped) assembly FASTA files. The problem appears to be intermittent: a retry of a crashed run with the same input files and same settings can seemingly randomly succeed or fail again. I'll investigate this more with multiple replicates. I haven't tried copying the genome FASTA file to /tmp. I have so far only run FCS-GX with Singularity. I don't know if I have access to Docker on the LSF. I'll find out whether I do.
I am using this with uncompressed FASTA files and Singularity. I have had the same error 3 times in a row with this FASTA file. I have pulled the latest Singularity container from the NCBI git. I am using SLURM on HPC (the Pawsey supercomputer). I ran this across a batch of 60 genomes; however, only one is receiving this error. It seems to run to about 99% and then fails with this. I have checked the FASTA file for issues and found none; it is a small genome.
`Prefetching /app/db/gxdb/gxdb/all.gxi 96%...
Prefetching /app/db/gxdb/gxdb/all.gxi 97%...
Prefetching /app/db/gxdb/gxdb/all.gxi 98%...
Prefetching /app/db/gxdb/gxdb/all.gxi 99%...
Prefetched /app/db/gxdb/gxdb/all.gxi in 1254.08s; 0.256136 GB/s. The file is 93% in RAM.
Collecting masking statistics...
gzip: stdin: No data available
Collected masking stats: 9.1081e-05 Gbp; 0.359757s; 0.253167 Mbp/s. Baseline: 1.02634
gzip: stdin: No data available
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG235.ilmn.230324.v129mh.fasta'])
Error: Process failed with retcode 1: ['gzip', '-cdf'])
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG235.ilmn.230324.v129mh.fasta'])
Error: Process failed with retcode 1: ['gzip', '-cdf'])
-----------------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in <module>
main()
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1012, in main
run_gx_pipeline(args)
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 678, in run_gx_pipeline
run(p_zcat_fasta, p_save_hits, p_main)
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 278, in __exit__
self.wait()
File "/tmp/Bazel.runfiles_lwww7goj/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 268, in wait
assert num_errors == 0, "Had errors."
AssertionError: Had errors.
Traceback (most recent call last):
File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
sys.exit(main())
File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
gx.run()
File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
self.args.func(self)
File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
self.run_gx()
File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
self.safe_exec(docker_args)
File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
File "/software/projects/pawsey0812/singularity/miniconda3/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['singularity', 'exec', '--bind', '/scratch/pawsey0812/lhuet/NCBI:/app/db/gxdb/', '--bind', '/scratch/pawsey0812/lhuet/NOVA_230324_AD1/OG235/assemblies/genome:/sample-volume/', '--bind', '/scratch/pawsey0812/lhuet/NOVA_230324_AD1/OG235/assemblies/genome/NCBI:/output-volume/', '/software/projects/pawsey0812/singularity/fcs-gx.sif', 'python3', '/app/bin/run_gx', '--fasta', '/sample-volume/OG235.ilmn.230324.v129mh.fasta', '--out-dir', '/output-volume/', '--gx-db', '/app/db/gxdb/gxdb', '--tax-id', '7898']' returned non-zero exit status 1.`
@etvedte: Regarding your question about my run environment: I'm working with @eeaunin on this, so the answers are the same as the ones he gives above.
We noticed that the Apptainer (formerly Singularity) changelog mentions addressing "no data available" errors, which could be related to the issue you are observing.
https://github.com/apptainer/apptainer/blob/main/CHANGELOG.md#other-changes
Would it be possible for you to attempt to reproduce the issue using the latest version of Apptainer?
I have now tried multiple replicates. I used a Plasmodium chabaudi chabaudi assembly FASTA file as the input. The Singularity version that I was using was singularity-ce 3.11.4. Yesterday I did 10 runs with the same assembly FASTA file and same settings and all 10 completed successfully. Today I did another 10 runs with the same files and settings. 3 out of 10 crashed with the gzip: stdin: No data available error.
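For reference, a replicate test like this can be tallied automatically. The sketch below is an assumption about how one might do it, not how these runs were actually launched: `classify_run` and `run_replicates` are hypothetical helpers that run an arbitrary command several times and count how many runs fail with the error text from this issue. The demo commands are stand-ins built with the Python interpreter itself; a real test would pass the actual FCS-GX invocation.

```python
import subprocess
import sys
from collections import Counter
from typing import Sequence

ERROR_MARKER = "gzip: stdin: No data available"


def classify_run(command: Sequence[str]) -> str:
    """Run the command once and label the outcome."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode == 0:
        return "ok"
    if ERROR_MARKER in result.stderr or ERROR_MARKER in result.stdout:
        return "no_data_available"
    return "other_failure"


def run_replicates(command: Sequence[str], replicates: int) -> Counter:
    """Run the same command `replicates` times and count each outcome label."""
    return Counter(classify_run(command) for _ in range(replicates))


if __name__ == "__main__":
    # Stand-in command that always fails with the error text, for demonstration only.
    failing = [
        sys.executable,
        "-c",
        "import sys; print('gzip: stdin: No data available', file=sys.stderr); sys.exit(1)",
    ]
    print(run_replicates(failing, 3))
```

Separating `no_data_available` from `other_failure` keeps unrelated failures, such as scheduler time limits, out of the count.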
Here are some things in response to the questions from a few days ago:
do you have access to Docker, or just Singularity? If you do have both, can you test using Docker?
I don't have proper access to Docker on the compute farm that I am using. There is a limited installation of Docker that doesn't allow writing results to disk. For production purposes I have to use Singularity.
Is it reproducible if the genome fasta file is initially copied to /tmp on the host, and then used as input?
I have now tested running FCS-GX with and without copying the assembly FASTA file to /tmp but for some reason, the same error hasn't reappeared in the past 4 days. I ran FCS-GX with the same Plasmodium chabaudi chabaudi assembly file that I mentioned before: 80 runs with the assembly FASTA copied to /tmp before the run and 80 runs without copying the assembly FASTA to /tmp. There were no crashes in either set of runs. I still have no idea what determines whether the crashes with the gzip: stdin: No data available error happen or not.
Would it be possible for you to attempt to reproduce the issue using the latest version of Apptainer?
The installation of Apptainer has been requested from the IT service desk but they haven't installed it yet.
That's good to hear.
We are also working on a new patch release that may or may not help with this issue. I'll keep you posted when that's available.
Hello, I have version 4.1.0 of Singularity and have pulled the latest version of FCS-GX from the git page. I am using the Slurm job scheduler on Pawsey.
I have run this 6 times across 35 genome assemblies; 9 of them have completed successfully, and the rest continually error out with gzip: stdin: No data available.
When I first posted about this error, I was running it across 24 (different) genome assemblies with only 2 receiving this error.
I have seen that I am getting more errors with Illumina data than with PacBio data.
`-----------------------------------------------------------------------------
tax-id : 7898
fasta : /sample-volume/OG193.ilmn.240313.v129mh.fasta
size : 795.28 MiB
split-fa : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^7898\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^7898\t']
BLAST-div : bony fishes
gx-div : anml:fishes
w/same-tax: True
bin-dir : /app/bin
gx-db : /app/db/gxdb/gxdb/all.gxi
gx-ver : Nov 27 2023 11:05:36; git:v0.5.0+branch--HEAD
output : /output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt
####### args: Namespace(fasta='/sample-volume/OG193.ilmn.240313.v129mh.fasta', tax_id=7898, species=None, split_fasta=True, div='anml:fishes', gx_db='/app/db/gxdb/gxdb/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, ignore_same_kingdom=False, out_basename='/output-volume//OG193.ilmn.240313.v129mh.7898', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, gzip_c='gzip -c', out_taxonomy_rpt='/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt')
####### Starting process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=833913207']
####### Starting process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Starting process ['gzip', '-cdf']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
Collecting masking statistics...
Collected masking stats: 0.825914 Gbp; 9.98127s; 82.7463 Mbp/s. Baseline: 1.77974
gzip: stdin: No data available
####### Cleaning up process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta'])
####### Cleaning up process ['gzip', '-cdf']
Error: Process failed with retcode 1: ['gzip', '-cdf'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=833913207']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/5']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG193.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['cat', '/sample-volume/OG193.ilmn.240313.v129mh.fasta']
####### Cleaning up process ['gzip', '-cdf']
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_qu08_ukq/runfiles/cgr_fcs/apps/fcs_genome/public/run_gx/run_gx.py", line 1037, in
We have a new FCS v0.5.4 release that may resolve this gzip: stdin: No data available issue. Can you update the version you are using and re-test?
Hello, thank you. I have tested FCS v0.5.4 across 26 of the same genomes I was using in my previous comment.
(I have version 4.1.0 of Singularity and have pulled the latest version of FCS-GX from the git page. I am using the Slurm job scheduler on Pawsey.)
On the first run, 19 passed without error, 2 failed due to the time limit, and 5 had the following error (the same error in all 5):
`-----------------------------------------------------------------------------
tax-id : 7898
fasta : /sample-volume/OG82.ilmn.240313.v129mh.fasta
size : 731.01 MiB
split-fa : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^7898\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^7898\t']
BLAST-div : bony fishes
gx-div : anml:fishes
w/same-tax: True
bin-dir : /app/bin
gx-db : /app/db/gxdb/gxdb/all.gxi
gx-ver : Jun 18 2024 11:01:15; git:v0.5.4-8-g3c7c426
output : /output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt
-----------------------------------------------------------------------------
####### args: Namespace(fasta='/sample-volume/OG82.ilmn.240313.v129mh.fasta', tax_id=7898, species=None, split_fasta=True, div='anml:fishes', gx_db='/app/db/gxdb/gxdb/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, out_basename='/output-volume//OG82.ilmn.240313.v129mh.7898', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, ignore_same_kingdom=None, out_taxonomy_rpt='/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt')
#######
CPU count: 256
#######
/proc/meminfo:
MemTotal: 1056103484 kB
MemFree: 489411400 kB
MemAvailable: 979519848 kB
Buffers: 8480 kB
Cached: 510103948 kB
SwapCached: 0 kB
Active: 484861268 kB
Inactive: 25463892 kB
Active(anon): 487404 kB
Inactive(anon): 690676 kB
Active(file): 484373864 kB
Inactive(file): 24773216 kB
Unevictable: 8928 kB
Mlocked: 80 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 8 kB
Writeback: 0 kB
AnonPages: 220456 kB
Mapped: 138828 kB
Shmem: 970968 kB
KReclaimable: 2611608 kB
Slab: 47955104 kB
SReclaimable: 2611608 kB
SUnreclaim: 45343496 kB
KernelStack: 45680 kB
PageTables: 10072 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 528051740 kB
Committed_AS: 1767820 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 1508708 kB
VmallocChunk: 0 kB
Percpu: 525312 kB
HardwareCorrupted: 0 kB
AnonHugePages: 61440 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 1874536 kB
DirectMap2M: 28387328 kB
DirectMap1G: 1043333120 kB
#######
top:
Mem: 566692784K used, 489410700K free, 970968K shrd, 8480K buff, 510109012K cached
CPU: 0.0% usr 0.0% sys 0.0% nic 99.9% idle 0.0% io 0.0% irq 0.0% sirq
Load average: 23.19 32.03 30.76 1/2600 93160
PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
93158 93157 tpeirce R 1524 0.0 25 0.0 top -b
117702 1 root S 3398m 0.1 144 0.0 /usr/sbin/slurmd -D -s
####### Starting process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
####### Starting process ['/app/bin/gx', 'get-fasta-stats']
####### Starting process ['tee', '/dev/fd/6']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=766520091', '--buffer-size=104857600']
####### Starting process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3']
####### Starting process ['pv', '--quiet', '--buffer-size=104857600']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
Collecting masking statistics...
Collected masking stats: 0.747621 Gbp; 10.1589s; 73.5923 Mbp/s. Baseline: 1.73023
####### Cleaning up process ['tee', '/dev/fd/6']
Error: Process failed with retcode 1: ['tee', '/dev/fd/6'])
####### Cleaning up process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=766520091', '--buffer-size=104857600']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3']
####### Cleaning up process ['pv', '--quiet', '--buffer-size=104857600']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['/app/bin/gx', 'get-fasta-stats']
####### Cleaning up process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
-----------------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 1091, in <module>
main()
File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 1066, in main
run_gx_pipeline(args)
File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 709, in run_gx_pipeline
with ProcessPipeline() as p_main:
File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 289, in __exit__
self.wait()
File "/tmp/Bazel.runfiles_bdt5byqf/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 279, in wait
assert num_errors == 0, "Had errors."
^^^^^^^^^^^^^^^
AssertionError: Had errors.
Traceback (most recent call last):
File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
sys.exit(main())
^^^^^^
File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
gx.run()
File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
self.args.func(self)
File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
self.run_gx()
File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
self.safe_exec(docker_args)
File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
File "/software/setonix/2024.05/software/linux-sles15-zen3/gcc-12.2.0/python-3.11.6-4ysxrvuaor6iljintmzcazlkfcokwnes/lib/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['singularity', 'exec', '--bind', '/scratch/pawsey0812/lhuet/NCBI:/app/db/gxdb/', '--bind', '/scratch/pawsey0812/tpeirce/DRAFTGENOME/OUTPUT/OG82/assemblies/genome:/sample-volume/', '--bind', '/scratch/pawsey0812/tpeirce/DRAFTGENOME/OUTPUT/OG82/assemblies/genome/NCBI:/output-volume/', '/software/projects/pawsey0812/singularity/fcs-gx.sif', 'python3', '/app/bin/run_gx', '--fasta', '/sample-volume/OG82.ilmn.240313.v129mh.fasta', '--out-dir', '/output-volume/', '--gx-db', '/app/db/gxdb/gxdb', '--tax-id', '7898', '--debug']' returned non-zero exit status 1.`
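With logs this long, it can help to pull out just the failed pipeline stages. The sketch below is a hypothetical helper (`failed_processes` is not part of FCS-GX) that assumes the `Error: Process failed with retcode N: [...]` line format seen in the logs in this thread and returns the return code and argument list for each failed process.

```python
import ast
import re
from typing import List, Tuple

# Matches lines such as:
#   Error: Process failed with retcode -13: ['cat', '/sample-volume/x.fasta'])
FAILURE_RE = re.compile(r"^Error: Process failed with retcode (-?\d+): (\[.*\])\)?\s*$")


def failed_processes(log_text: str) -> List[Tuple[int, List[str]]]:
    """Return (retcode, argv) for every failed-process line found in the log text."""
    failures = []
    for line in log_text.splitlines():
        match = FAILURE_RE.match(line.strip())
        if match:
            retcode = int(match.group(1))
            # The argument list is printed as a Python list literal.
            argv = ast.literal_eval(match.group(2))
            failures.append((retcode, argv))
    return failures


if __name__ == "__main__":
    sample = (
        "Error: Process failed with retcode 1: ['tee', '/dev/fd/6'])\n"
        "Error: Process failed with retcode -13: "
        "['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta'])\n"
    )
    for retcode, argv in failed_processes(sample):
        print(retcode, argv[0])
```

In the log above this would report `tee` with return code 1 and `cat` with return code -13, which is where v0.5.4 differs from the earlier logs that showed `gzip` failing.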
On the second run, 3 of the 5 that previously had the error passed and 2 failed with the following error (2 again failed due to the time limit):
`-----------------------------------------------------------------------------
tax-id : 7898
fasta : /sample-volume/OG82.ilmn.240313.v129mh.fasta
size : 731.01 MiB
split-fa : True
####### Starting process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Starting process ['grep', '-E', '^7898\t']
####### Cleaning up process ['zcat', '-f', '/app/db/gxdb/gxdb/all.blast_div.tsv.gz']
####### Cleaning up process ['grep', '-E', '^7898\t']
BLAST-div : bony fishes
gx-div : anml:fishes
w/same-tax: True
bin-dir : /app/bin
gx-db : /app/db/gxdb/gxdb/all.gxi
gx-ver : Jun 18 2024 11:01:15; git:v0.5.4-8-g3c7c426
output : /output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt
-----------------------------------------------------------------------------
####### args: Namespace(fasta='/sample-volume/OG82.ilmn.240313.v129mh.fasta', tax_id=7898, species=None, split_fasta=True, div='anml:fishes', gx_db='/app/db/gxdb/gxdb/all.gxi', mask_transposons=None, bin_dir='/app/bin', allow_same_species=True, out_basename='/output-volume//OG82.ilmn.240313.v129mh.7898', out_dir='/output-volume/', action_report=True, save_hits=False, generate_logfile=False, debug=True, phone_home_label=None, gc_acc=None, gc_genomes_root_dir=None, production_build_name=None, ignore_same_kingdom=None, out_taxonomy_rpt='/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt')
#######
CPU count: 256
#######
/proc/meminfo:
MemTotal: 1056103484 kB
MemFree: 552111716 kB
MemAvailable: 831867096 kB
Buffers: 7268 kB
Cached: 299756428 kB
SwapCached: 0 kB
Active: 50246196 kB
Inactive: 418118744 kB
Active(anon): 678280 kB
Inactive(anon): 168927556 kB
Active(file): 49567916 kB
Inactive(file): 249191188 kB
Unevictable: 8928 kB
Mlocked: 80 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 4 kB
Writeback: 0 kB
AnonPages: 168610312 kB
Mapped: 164812 kB
Shmem: 1008952 kB
KReclaimable: 2681424 kB
Slab: 30897148 kB
SReclaimable: 2681424 kB
SUnreclaim: 28215724 kB
KernelStack: 47376 kB
PageTables: 479420 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 528051740 kB
Committed_AS: 171739260 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 1508724 kB
VmallocChunk: 0 kB
Percpu: 470016 kB
HardwareCorrupted: 0 kB
AnonHugePages: 127375360 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 2312808 kB
DirectMap2M: 79329280 kB
DirectMap1G: 991952896 kB
#######
top:
Mem: 503990456K used, 552113028K free, 1008952K shrd, 7268K buff, 299760140K cached
CPU: 6.3% usr 0.0% sys 0.0% nic 93.6% idle 0.0% io 0.0% irq 0.0% sirq
Load average: 17.34 17.82 17.73 17/2709 165178
PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
229719229695 22120 R 18.7g 0.9 10 0.3 /software/projects/pawsey0265/jho/setonix/orca_5_0_3_linux_x86-64_shared_openmpi411/orca_mp2_mpi FC9_CUTALL.r2scan.rev-DSD-PBEP86-D4.mp2inp.tmp FC9_CUTALL.r2scan.rev-DSD-PBEP86-D4
229720229695 22120 R 18.6g 0.9 24 0.3 /software/projects/pawsey0265/jho/setonix/orca_5_0_3_linux_x86-64_shared_openmpi411/orca_mp2_mpi FC9_CUTALL.r2scan.rev-DSD-PBEP86-D4.mp2inp.tmp FC9_CUTALL.r2scan.rev-DSD-PBEP86-D4
####### Starting process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
####### Starting process ['/app/bin/gx', 'get-fasta-stats']
####### Starting process ['tee', '/dev/fd/6']
####### Starting process ['/app/bin/gx', 'split-fasta']
####### Starting process ['pv', '-Wbratpe', '--interval=0.5', '--size=766520091', '--buffer-size=104857600']
####### Starting process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
####### Starting process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3']
####### Starting process ['pv', '--quiet', '--buffer-size=104857600']
####### Starting process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
GX requires the database to be entirely in RAM to avoid thrashing.
Consider placing the database files in a non-swappable tmpfs or ramfs.
See https://github.com/ncbi/fcs/wiki/FCS-GX for details.
Will prefetch (vmtouch) the database pages to have the OS cache them in main memory.
Prefetching /app/db/gxdb/gxdb/all.gxi 93%...
Prefetching /app/db/gxdb/gxdb/all.gxi 94%...
Prefetching /app/db/gxdb/gxdb/all.gxi 95%...
Prefetching /app/db/gxdb/gxdb/all.gxi 96%...
Prefetching /app/db/gxdb/gxdb/all.gxi 97%...
Prefetching /app/db/gxdb/gxdb/all.gxi 98%...
Prefetching /app/db/gxdb/gxdb/all.gxi 99%...
Prefetched /app/db/gxdb/gxdb/all.gxi in 4411.67s; 0.0728106 GB/s. The file is 0% in RAM.
Collecting masking statistics...
Fatal error: Stream terminated abnormally.
Command exited with non-zero status 1
Command being timed: "nice -n19 /app/bin/gx align --gx-db=/app/db/gxdb/gxdb/all.gxi --repeats-basis-fa=/dev/fd/3"
User time (seconds): 48.23
System time (seconds): 1485.22
Percent of CPU this job got: 22%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1h 52m 01s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 887733552
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 117765165
Minor (reclaiming a frame) page faults: 3998362
Voluntary context switches: 143447
Involuntary context switches: 2400
Swaps: 0
File system inputs: 942766856
File system outputs: 1648
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 1
Warning: missing header '##[["GX hits",2,1]]'
Fatal error: taxify.cpp:327 in make_run_info_json(...): Assertion failed: agg_cvg <= 1
####### Cleaning up process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
Error: Process failed with retcode -13: ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta'])
####### Cleaning up process ['tee', '/dev/fd/6']
Error: Process failed with retcode -13: ['tee', '/dev/fd/6'])
####### Cleaning up process ['/app/bin/gx', 'split-fasta']
Error: Process failed with retcode -13: ['/app/bin/gx', 'split-fasta'])
####### Cleaning up process ['pv', '-Wbratpe', '--interval=0.5', '--size=766520091', '--buffer-size=104857600']
####### Cleaning up process ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3']
Error: Process failed with retcode 1: ['/busybox/time', '-v', 'nice', '-n19', '/app/bin/gx', 'align', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--repeats-basis-fa=/dev/fd/3'])
####### Cleaning up process ['pv', '--quiet', '--buffer-size=104857600']
####### Cleaning up process ['/app/bin/gx', 'taxify', '--gx-db=/app/db/gxdb/gxdb/all.gxi', '--output=/output-volume//OG82.ilmn.240313.v129mh.7898.taxonomy.rpt.tmp', '--asserted-div=anml:fishes', '--db-exclude-locs=/app/bin/db_exclude.locs.tsv']
####### Cleaning up process ['/app/bin/gx', 'get-fasta-stats']
####### Cleaning up process ['cat', '/sample-volume/OG82.ilmn.240313.v129mh.fasta']
-----------------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 1091, in <module>
main()
File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 1066, in main
run_gx_pipeline(args)
File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 709, in run_gx_pipeline
with ProcessPipeline() as p_main:
File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 289, in __exit__
self.wait()
File "/tmp/Bazel.runfiles_892n3sya/runfiles/gdh_datasets/apps/fcs_genome/public/run_gx.py", line 279, in wait
assert num_errors == 0, "Had errors."
^^^^^^^^^^^^^^^
AssertionError: Had errors.
Traceback (most recent call last):
File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
sys.exit(main())
^^^^^^
File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
gx.run()
File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
self.args.func(self)
File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
self.run_gx()
File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
self.safe_exec(docker_args)
File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
File "/software/setonix/2024.05/software/linux-sles15-zen3/gcc-12.2.0/python-3.11.6-4ysxrvuaor6iljintmzcazlkfcokwnes/lib/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['singularity', 'exec', '--bind', '/scratch/pawsey0812/lhuet/NCBI:/app/db/gxdb/', '--bind', '/scratch/pawsey0812/tpeirce/DRAFTGENOME/OUTPUT/troubleshoot/OG82/assemblies/genome:/sample-volume/', '--bind', '/scratch/pawsey0812/tpeirce/DRAFTGENOME/OUTPUT/troubleshoot/OG82/assemblies/genome/NCBI:/output-volume/', '/software/projects/pawsey0812/singularity/fcs-gx.sif', 'python3', '/app/bin/run_gx', '--fasta', '/sample-volume/OG82.ilmn.240313.v129mh.fasta', '--out-dir', '/output-volume/', '--gx-db', '/app/db/gxdb/gxdb', '--tax-id', '7898', '--debug']' returned non-zero exit status 1.
Now, when I try to re-run them, they fail with the following error:
Traceback (most recent call last):
File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
sys.exit(main())
File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
gx.run()
File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
self.args.func(self)
File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
self.run_gx()
File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
self.safe_exec(docker_args)
File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
File "/usr/lib64/python3.6/subprocess.py", line 423, in run
with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'text'
I plan to keep trying to run them over the coming week.
This error now appears every time I run any of the fcs commands (for cleaning the genome, and even when trying to run the FCS adaptor):
Traceback (most recent call last):
File "/software/projects/pawsey0812/singularity/fcs.py", line 445, in <module>
sys.exit(main())
File "/software/projects/pawsey0812/singularity/fcs.py", line 434, in main
gx.run()
File "/software/projects/pawsey0812/singularity/fcs.py", line 345, in run
self.args.func(self)
File "/software/projects/pawsey0812/singularity/fcs.py", line 323, in run_screen_mode
self.run_gx()
File "/software/projects/pawsey0812/singularity/fcs.py", line 241, in run_gx
self.safe_exec(docker_args)
File "/software/projects/pawsey0812/singularity/fcs.py", line 166, in safe_exec
subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
File "/usr/lib64/python3.6/subprocess.py", line 423, in run
with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'text'
@LaurenHuet can you verify your Python version? FCS requires 3.7+. I see some of your older logs show python-3.11.6, but did your configuration change since then?
Can you also verify that you are using the latest version of the fcs.py runner? You can check with `grep DEFAULT_VERSION fcs.py`.
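For anyone hitting the same `TypeError`: the `text` keyword of `subprocess.run()` was added in Python 3.7, so the traceback above (which goes through `/usr/lib64/python3.6/subprocess.py`) is what you get when the runner is started by a 3.6 interpreter. Below is a minimal sketch for checking which interpreter is actually being picked up. The helper names `check_python` and `run_text` are hypothetical and are not part of fcs.py; `run_text` only illustrates that `universal_newlines=True` is the older spelling accepted by both 3.6 and 3.7+.

```python
import subprocess
import sys

# subprocess.run() gained the `text` keyword in Python 3.7.
MIN_VERSION = (3, 7)


def check_python(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the interpreter supports subprocess.run(text=True)."""
    return tuple(version_info[:2]) >= minimum


def run_text(args):
    """Run a command and capture stdout as str.

    `universal_newlines=True` behaves like `text=True` but is also
    accepted by Python 3.6 (hypothetical helper, not from fcs.py).
    """
    return subprocess.run(
        args,
        shell=False,
        check=True,
        universal_newlines=True,
        stdout=subprocess.PIPE,
    )


if __name__ == "__main__":
    # Show which interpreter is running this script and its version.
    print("interpreter:", sys.executable)
    print("version:", ".".join(map(str, sys.version_info[:3])))
    if not check_python():
        sys.exit("Python 3.7+ is needed for subprocess.run(..., text=True)")
```

Running this with the same `python3` that launches fcs.py (for example inside the job script, after any module loads) should show whether the environment has fallen back to the system Python 3.6.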
Hey, it's all working now, thank you for your help. It was an issue with Python switching versions on my system!
Good to hear! Since there don't appear to be any current issues in this thread I'm going to close. If anything else comes up, feel free to make a new GitHub issue.