halc
High Throughput Algorithm for Long Read Error Correction
LATEST NEWS
The HALC paper has been accepted for publication in BMC Bioinformatics!
Overview
HALC is a high-throughput error correction tool for long reads.
Copyright
HALC is released under the Artistic License 2.0.
Short manual
System requirements
HALC runs on 32-bit or 64-bit machines with a Linux operating system. At least 4GB of system memory is recommended for correcting larger data sets.
Installation
The aligner BLASR and the error correction software LoRDEC (only for -ordinary mode) are required to run HALC.
- Compile the source files in the 'src' and 'thirdparty' folders to generate a 'bin' folder by running the Makefile:
make all
- Add BLASR, LoRDEC and the 'bin' folder to your $PATH:
export PATH=PATH2BLASR:$PATH
export PATH=PATH2LoRDEC:$PATH
export PATH=PATH2bin:$PATH
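Before running HALC, it can help to confirm that the required tools are reachable through $PATH. The following is a minimal sketch and not part of HALC: the helper name `missing_tools` is hypothetical, and the executable names `blasr` and `lordec-correct` are assumptions about how BLASR and LoRDEC are installed on your system.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ["blasr", "lordec-correct"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on $PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling `missing_tools()` returns an empty list when every listed executable is found; any name it returns points to a PATH entry that still needs to be exported.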
Inputs
- Long reads in FASTA format.
- Contigs assembled from the corresponding short reads in FASTA format.
- The initial short reads in FASTA format (only for -ordinary mode; obtained with cat left_reads.fa >short_reads.fa and then cat right_reads.fa >>short_reads.fa).
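The short-read file for -ordinary mode is simply the left reads followed by the right reads. As an alternative to the two cat commands, the same concatenation can be sketched in Python; the function name `merge_short_reads` is hypothetical and only illustrates the step.

```python
import shutil


def merge_short_reads(left_path, right_path, out_path):
    """Concatenate two FASTA files into one, left reads first."""
    with open(out_path, "wb") as out:
        for path in (left_path, right_path):
            with open(path, "rb") as src:
                # Stream the file in chunks so large read sets
                # are not loaded into memory at once.
                shutil.copyfileobj(src, out)
```

For the file names used above, the call would be `merge_short_reads("left_reads.fa", "right_reads.fa", "short_reads.fa")`.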
Using HALC
runHALC.py long_reads.fa contigs.fa [-options|-options]
Options (default value):
-o/-ordinary short_reads.fa (yes)
Ordinary mode, which uses repeats for correction. The error correction software LoRDEC and the initial short reads are required to refine the repeat-corrected regions. It is mutually exclusive with the -repeat-free option.
-r/-repeat-free (no)
Repeat-free mode, which does not use repeats for correction. It is mutually exclusive with the -ordinary option.
-b/-boundary n (4)
Maximum boundary difference to split the subcontigs.
-a/-accurate (yes)
Accurate construction of the contig graph.
-c/-coverage n (auto)
Expected coverage on contigs. If not specified, it is calculated automatically.
-w/-width n (4)
Maximum width of the dynamic programming table.
-k/-kmer n (25)
Kmer length for LoRDEC refinement.
-t/-threads n (auto)
Number of threads for one process to create. By default, it is set to the number of computing cores.
-l/-log (no)
System log to print.
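When HALC is called from a pipeline, the command line can be assembled programmatically. The sketch below is an illustration rather than part of HALC: the helper names `build_halc_command` and `run_halc` are hypothetical, only a few of the options listed above are covered, and it assumes runHALC.py is reachable through $PATH.

```python
import subprocess


def build_halc_command(long_reads, contigs, short_reads=None,
                       repeat_free=False, threads=None, log=False):
    """Build the argument list for runHALC.py from the documented options."""
    # -ordinary and -repeat-free are documented as mutually exclusive.
    if repeat_free and short_reads is not None:
        raise ValueError("-ordinary and -repeat-free are mutually exclusive")
    cmd = ["runHALC.py", long_reads, contigs]
    if repeat_free:
        cmd.append("-r")
    elif short_reads is not None:
        cmd += ["-o", short_reads]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if log:
        cmd.append("-l")
    return cmd


def run_halc(long_reads, contigs, **options):
    """Run HALC; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_halc_command(long_reads, contigs, **options),
                   check=True)
```

For example, `build_halc_command("long_reads.fa", "contigs.fa", short_reads="short_reads.fa", threads=8)` yields the arguments for an ordinary-mode run with eight threads. Building the list separately from running it makes the command easy to inspect or log before execution.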
Outputs
- Error corrected full long reads.
- Error corrected trimmed long reads.
- Error corrected split long reads.
Chinese name
HALC's Chinese name is 浩克.