
quantitation type validation error

Open ppavlidis opened this issue 2 years ago • 3 comments

This occurs for an RNA-seq data set, GSE228497, while running rnaseqDataAdd.

This triggers warnings, not an exception, thankfully. These data are not ratiometric and are definitely on a log2 scale, and they look fine in the diagnostics. This is related to my general complaint that curators often have to suppress these checks when running makeProcessedData.

2023-11-30 10:28:31,197 WARN 30222 [main] u.g.c.d.m.ExpressionDataDoubleMatrixUtil.ensureLog2Scale(152) | The scale LOG2 differs from the one inferred from data: LOGBASEUNKNOWN.
2023-11-30 10:28:31,198 WARN 30222 [main] u.g.c.d.m.ExpressionDataDoubleMatrixUtil.ensureLog2Scale(162) | The expression data appears to ratiometric, but the quantitation says otherwise.


ppavlidis avatar Nov 30 '23 18:11 ppavlidis

I'm seeing this for many other data sets as well, such as GSE228997 and GSE228156. Maybe it always happens; I'm not sure.

ppavlidis avatar Nov 30 '23 18:11 ppavlidis

We shouldn't raise a warning if the inferred scale is LOGBASEUNKNOWN and the assigned one is some specific log-scale.
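
To make the proposed rule concrete, here is a minimal sketch of the comparison, written in Python for brevity rather than in Gemma's Java. The `Scale` enum, `SPECIFIC_LOG_SCALES`, and `scales_conflict` are hypothetical names for illustration only and are not Gemma's actual API.

```python
from enum import Enum


class Scale(Enum):
    # Hypothetical stand-in for the scale types mentioned in the warnings.
    LINEAR = "LINEAR"
    LOG2 = "LOG2"
    LOG10 = "LOG10"
    LN = "LN"
    LOGBASEUNKNOWN = "LOGBASEUNKNOWN"


# Assigned scales that already name a specific log base.
SPECIFIC_LOG_SCALES = {Scale.LOG2, Scale.LOG10, Scale.LN}


def scales_conflict(assigned: Scale, inferred: Scale) -> bool:
    """Return True only when the inferred scale contradicts the assigned one."""
    if assigned == inferred:
        return False
    # "Some log scale, base unknown" is compatible with any specific log
    # scale, so it should not produce a warning.
    if inferred is Scale.LOGBASEUNKNOWN and assigned in SPECIFIC_LOG_SCALES:
        return False
    return True
```

With this rule, an assigned LOG2 and an inferred LOGBASEUNKNOWN (the case in the log above) would pass silently, while an assigned LOG2 and an inferred LINEAR would still be flagged.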

As for the ratiometric part, it means that the mean of the data is extremely close to zero. I can relax the threshold a little bit.
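
As a rough illustration of that heuristic, the sketch below (Python, not Gemma's Java) flags data as ratiometric when the mean is within a tolerance of zero. The function name `looks_ratiometric` and the default `tolerance` of 0.1 are assumptions made up for this example; the actual threshold used by the check is not stated in this thread.

```python
import math
from statistics import fmean


def looks_ratiometric(values, tolerance=0.1):
    """Guess that data are ratiometric if their mean is very close to zero.

    `tolerance` is a hypothetical threshold chosen for illustration.
    """
    # Ignore missing values, which are commonly encoded as NaN.
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        return False
    return abs(fmean(finite)) < tolerance
```

In this sketch, relaxing the check corresponds to lowering `tolerance`, so that fewer data sets whose mean merely happens to be near zero get flagged.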

arteymix avatar Feb 10 '24 02:02 arteymix

We really need to eliminate the excessive false positives. This is for an RNA-seq experiment where we always have control over the scaling. If this is unavoidable, then disable all checks for RNA-seq data. The quantitation type ambiguities are all for microarray data sets.

Having to remember to pass -ignoreqm to makeProcessedData is annoying and defeats the purpose of the checks.

Error while processing ExpressionExperiment Id=35086 Name=Genotype-by-diet interactions determine susceptibility and resistance in T2D mouse models [Hypothalamus] Short Name=GSE235479:
QuantitationType Id=591666 Name=log2cpm General Type=QUANTITATIVE Type=AMOUNT Scale=LOG2 Representation=DOUBLE [Recomputed From Raw] [Preferred]:
        The scale LOG2 differs from the one inferred from data: LOGBASEUNKNOWN.

ppavlidis avatar May 16 '24 23:05 ppavlidis