
[question]: maximum available parallelism and number of threads

Open • tamuanand opened this issue 1 year ago • 1 comment

Hi @rob-p

I am trying out this tutorial: https://divingintogeneticsandgenomics.com/post/how-to-use-salmon-alevin-to-preprocess-cite-seq-data/

# install via conda 
conda create -n af -y -c bioconda -c conda-forge simpleaf
conda activate af

# software installed
# simpleaf - 0.16.2
# salmon - 1.10.3
# alevin-fry - 0.9.0
# piscem - 0.8.0 

When it comes to this step,

simpleaf index \
--output $IDX_DIR \
--fasta $REF_DIR/fasta/genome.fa \
--gtf $REF_DIR/genes/genes.gtf \
--rlen 90 \
--threads 16 

I get this warning message: The maximum available parallelism is 1, but 16 threads were requested.

2024-04-01T03:13:01.179899Z  INFO simpleaf::simpleaf_commands::indexing: preparing to make reference with roers
2024-04-01T03:13:13.637065Z  INFO grangers::reader::gtf: Finished parsing the input file. Found 5 comments and 2765969 records.
2024-04-01T03:13:16.174268Z  INFO roers: Built the Grangers object for 2765969 records
2024-04-01T03:13:19.018054Z  INFO roers: Proceed 1305354 exon records from 199138 transcripts
2024-04-01T03:13:38.053968Z  INFO roers: Processing 1106216 intronic records
2024-04-01T03:14:06.180480Z  INFO roers: Done!
2024-04-01T03:14:06.201824Z  WARN simpleaf::simpleaf_commands::indexing: The maximum available parallelism is 1, but 16 threads were requested.
2024-04-01T03:14:06.201843Z  WARN simpleaf::simpleaf_commands::indexing: setting number of threads to 1

This is the lscpu output from the machine where I am running the above:

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  48
  On-line CPU(s) list:   0-47
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
    CPU family:          6
    Model:               106
    Thread(s) per core:  2
    Core(s) per socket:  24
    Socket(s):           1
    Stepping:            6
    BogoMIPS:            5799.95
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
                          mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nop
                         l xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma
                          cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f
                         16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb s
                         tibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f
                          avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw a
                         vx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd ida arat avx512vbmi pku ospke a
                         vx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq
                          rdpid md_clear flush_l1d arch_capabilities
Virtualization features:
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):
  L1d:                   1.1 MiB (24 instances)
  L1i:                   768 KiB (24 instances)
  L2:                    30 MiB (24 instances)
  L3:                    54 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-47
Vulnerabilities:
  Gather data sampling:  Unknown: Dependent on hypervisor status
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT Host state unknown
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequen
                         ce
  Srbds:                 Not affected
  Tsx async abort:       Not affected

I am running this on an m6i.12xlarge instance: https://aws.amazon.com/ec2/instance-types/m6i/

Question: Would you know why the number of threads is being set to 1? The resulting piscem build command is:

INFO simpleaf::simpleaf_commands::indexing: piscem build cmd : 
<path_to>/piscem build -k 31 -m 19 \
-o <path_to>/index/piscem_idx \
-s <path_to>/ref/roers_ref.fa --seed 1 --threads 1

Thanks in advance.

tamuanand avatar Apr 01 '24 03:04 tamuanand

Hi @tamuanand,

Very interesting. We may have to spin up an AWS instance to test this out. The hardware concurrency is evaluated using available_parallelism. While there are caveats in the documentation (there always are with querying such hardware capabilities), I don't see any that suggest why it might undercount the available parallelism so much. In the worst case, we could, of course, add a flag to "force" the user-requested parallelism. However, my broader concern is how successful that will be if, for some reason, the available parallelism at the language level isn't being correctly detected. @DongzeHE and I will look into this and get back to you.

--Rob

rob-p avatar Apr 01 '24 14:04 rob-p
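
For readers following along, the behavior seen in the log (query the available parallelism, then cap the requested thread count) can be sketched in Rust roughly as follows. This is a minimal illustration and not simpleaf's actual code: `effective_threads` is a hypothetical helper, and the messages only mirror the warnings quoted above. The one real API used is `std::thread::available_parallelism`. On Linux, the standard library documentation notes that this value can reflect CPU affinity masks and cgroup quotas, so it may differ from the `CPU(s)` count reported by lscpu.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Clamp a requested thread count to the reported available parallelism.
/// Hypothetical helper for illustration; not taken from simpleaf.
fn effective_threads(requested: usize, available: usize) -> usize {
    // Always use at least one thread, and never more than `available`.
    requested.clamp(1, available.max(1))
}

fn main() {
    // Mirrors `--threads 16` from the command in the question.
    let requested = 16;

    // available_parallelism() returns io::Result<NonZeroUsize>;
    // fall back to 1 if the query fails.
    let available = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1);

    let used = effective_threads(requested, available);
    if used < requested {
        eprintln!(
            "The maximum available parallelism is {available}, but {requested} threads were requested."
        );
        eprintln!("setting number of threads to {used}");
    }
    println!("requested={requested} available={available} used={used}");
}
```

Running a small program like this in the same shell and conda environment as the failing command, and comparing its output with `nproc` and `taskset -cp $$`, may help show whether the process is restricted to a single CPU by an affinity mask or a container/cgroup limit rather than by the hardware itself.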