Feature Request - report adapter names detected/removed

Open · chrisgulvik opened this issue 5 years ago · 1 comment

You implemented two adapter removal processes (by overlap analysis and by adapter sequence), so thank you for all your efforts! I don't know of another read filtering package that reports the names of the adapters it removed, but I think it would be valuable to log for routine work, as confirmation that (1) the correct adapters were removed and (2) off-target adapters were not removed.

This might not be an easy addition for the overlap analysis searches, but I think it could be done for the searches by adapter sequence, since you already store the sequences and their names in a std::map object.

Current output:

reads with adapter trimmed: 184542
bases trimmed due to adapters: 2575728

Example of ideal output:

reads with adapter trimmed: 184542
bases trimmed due to adapters: 2575728
adapters detected: 184542
  TruSeq_Adapter_Index_25: 184532
  I7_Primer_Nextera_XT_Index_Kit_v2_N720: 10
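A minimal sketch of how such a per-name counter could work, assuming a std::map keyed by adapter sequence with the adapter name as the value. The class AdapterCounter, its methods, and the "unknown" bucket are hypothetical names for illustration and are not part of fastp; the real hook point would be wherever a read is trimmed by an adapter-sequence match.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not fastp code): tallies trimmed reads per adapter name.
class AdapterCounter {
public:
    // knownAdapters maps adapter sequence -> adapter name (assumed layout).
    explicit AdapterCounter(std::map<std::string, std::string> knownAdapters)
        : mKnown(std::move(knownAdapters)) {}

    // Call once for each read trimmed by an adapter-sequence match.
    void record(const std::string& adapterSeq) {
        auto it = mKnown.find(adapterSeq);
        // Sequences without a known name are pooled under "unknown".
        const std::string name = (it != mKnown.end()) ? it->second : "unknown";
        ++mCounts[name];
        ++mTotal;
    }

    // Builds the summary: a total line, then one indented line per name,
    // sorted by count (descending) and then by name.
    std::string report() const {
        std::vector<std::pair<std::string, long>> rows(mCounts.begin(), mCounts.end());
        std::sort(rows.begin(), rows.end(),
                  [](const std::pair<std::string, long>& a,
                     const std::pair<std::string, long>& b) {
                      if (a.second != b.second) return a.second > b.second;
                      return a.first < b.first;
                  });
        std::ostringstream out;
        long sum = 0;
        out << "adapters detected: " << mTotal << "\n";
        for (const auto& row : rows) {
            out << "  " << row.first << ": " << row.second << "\n";
            sum += row.second;
        }
        assert(sum == mTotal);  // per-name counts must add up to the total
        return out.str();
    }

private:
    std::map<std::string, std::string> mKnown;
    std::map<std::string, long> mCounts;
    long mTotal = 0;
};
```

The cost per trimmed read is one map lookup and two increments, so the counter could sit behind a non-default option if that overhead matters for routine runs.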

The HTML report currently just lists "other adapter sequences <int>" for R1 and R2, and it could be expanded to name the adapters in the same way as the printed output. As folks suggest options to change how adapters are detected (e.g., #256, #200, and #130), I think this will become more important.

chrisgulvik, May 28 '20 18:05

@sfchen Any thoughts on this reporting? I just ran into another case where this would be valuable (wrongly entered indices). If speed for routine runs is a concern, adding the counter and reporting as a non-default option would be fine by me.

chrisgulvik, Mar 21 '23 02:03