Interpretation of the qualpos
How should the qualpos output be interpreted? We are seeing a lot of + symbols in the plot.
This is a box plot from matplotlib.pyplot. You can read about it here: http://matplotlib.org/api/pyplot_api.html
The line in the box is the median, and the ends of the box are the 25th and 75th percentiles. By default the whiskers extend to the most extreme data points that lie within 1.5 times the interquartile range of the box edges. Data points beyond the whiskers are treated as outliers and show up as +.
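To make the outlier rule concrete, here is a minimal standard-library sketch of how the box, whiskers and + points are derived under the default 1.5 x IQR rule. The function name `box_stats` and the quality values are made up for illustration, and this only approximates what matplotlib does internally; it is not the qualpos code.

```python
import statistics

def box_stats(values, whis=1.5):
    """Approximate the default box plot statistics for one position."""
    # Quartiles with linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - whis * iqr
    high_limit = q3 + whis * iqr
    # Whiskers stop at the most extreme data points inside the limits.
    inside = [v for v in values if low_limit <= v <= high_limit]
    # Everything outside the limits is drawn as an individual point.
    fliers = [v for v in values if v < low_limit or v > high_limit]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "fliers": fliers,
    }

# Hypothetical base qualities observed at one read position.
quals = [4, 5, 5, 6, 6, 7, 7, 8, 9, 31]
stats = box_stats(quals)
print(stats)
```

With these values the single Q31 base falls outside the upper limit, so it would be drawn as a lone + above the whisker.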
The whiskers should probably just be set to show the range (min, max) of the data.
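As a rough sketch of that change, matplotlib's `boxplot` accepts a pair of percentiles for `whis`, so `whis=(0, 100)` puts the whiskers at the minimum and maximum and leaves no points to draw as outliers. The data and variable names below are hypothetical, and this is not the actual qualpos plotting code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view interactively
import matplotlib.pyplot as plt

# Hypothetical data: one list of base qualities per read position.
quals_by_position = [
    [4, 5, 5, 6, 6, 7, 7, 8, 9, 31],
    [4, 5, 6, 6, 7, 7, 8, 8, 10, 28],
    [4, 4, 5, 6, 6, 7, 8, 9, 9, 33],
]

fig, (ax_default, ax_range) = plt.subplots(1, 2, sharey=True)

# Left panel: default whiskers (1.5 x IQR), outliers drawn individually.
default_bp = ax_default.boxplot(quals_by_position)
ax_default.set_title("default whiskers")

# Right panel: whiskers at the 0th and 100th percentiles (min and max).
range_bp = ax_range.boxplot(quals_by_position, whis=(0, 100))
ax_range.set_title("whiskers at min/max")

for ax in (ax_default, ax_range):
    ax.set_xlabel("position in read")
ax_default.set_ylabel("quality score")

# Count the outlier points drawn in each panel.
n_default_fliers = sum(len(line.get_ydata()) for line in default_bp["fliers"])
n_range_fliers = sum(len(line.get_ydata()) for line in range_bp["fliers"])
print(n_default_fliers, n_range_fliers)
```

To look at the result, call `fig.savefig(...)` with a file name of your choice, or drop the `Agg` line and use `plt.show()`.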
I wrote that code a long time ago to show that the quality score distribution stays pretty much the same across the length of the reads. Be careful: if you use it with all possible reads, there is sometimes a subset of template reads that are really long (longer than the longest 2D and complement reads) and low quality, which makes it look like the quality drops off at a certain length. However, if you plot the qual_v_pos for only those reads, you can see that the (low) quality stays the same across the length of the reads.
Dear JohnUrban, thanks, that's cool. One more query: do some reads (the outlier reads) have higher quality, Q>30? The outlier reads appear to be of very high quality. How is that possible?
I am not sure I fully understand the question. Are you asking: How is it possible that some reads have Quality Scores >30?
qualpos looks at the individual quality score assigned to each base of every read (quality vs. position in read). Some bases get very high quality scores as determined by ONT. However, the mean quality score of a read (which is used for filtering 2D reads into pass and fail bins, for example) usually falls within pretty predictable bounds. 1D mean quality scores are usually between something like 1-6, and 2D mean quality scores are usually between ~6-12. Maybe you find the individual outlier quality scores surprising because you are used to seeing the mean quality scores. (?)
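A small sketch of the difference between per-base and per-read quality, assuming a Phred+33 encoded FASTQ quality string and a plain arithmetic mean over the base scores (both are assumptions for illustration, not necessarily how ONT or qualpos computes the read-level score). The quality string is made up.

```python
def decode_phred33(qual_string):
    """Convert a FASTQ quality string to per-base scores (assumes Phred+33)."""
    return [ord(char) - 33 for char in qual_string]

# Hypothetical quality string for one read: mostly Q8-Q11 with one Q35 base.
qual_string = ")*+*D*,+*)+*)+*,)*+*"

per_base = decode_phred33(qual_string)
mean_quality = sum(per_base) / len(per_base)  # simple arithmetic mean (assumption)
high_bases = [q for q in per_base if q > 30]

print("per-base scores:", per_base)
print("max base quality:", max(per_base))
print("mean read quality:", mean_quality)
print("bases with Q>30:", len(high_bases))
```

Here one base scores Q35 and would show up as an outlier at its position in the box plot, yet the mean for the whole read stays close to Q10.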
I'd be happy to clarify further.