quantitation type validation error
Seen for an RNA-seq data set, GSE228497, while running rnaseqDataAdd.
This triggers warnings, not an exception, thankfully. These data are not ratiometric, are definitely on a log2 scale, and look fine in the diagnostics. This is related to a general complaint I have: curators often have to suppress these checks when running makeProcessedData.
2023-11-30 10:28:31,197 WARN 30222 [main] u.g.c.d.m.ExpressionDataDoubleMatrixUtil.ensureLog2Scale(152) | The scale LOG2 differs from the one inferred from data: LOGBASEUNKNOWN.
2023-11-30 10:28:31,198 WARN 30222 [main] u.g.c.d.m.ExpressionDataDoubleMatrixUtil.ensureLog2Scale(162) | The expression data appears to ratiometric, but the quantitation says otherwise.
I'm seeing this for many other data sets as well, such as GSE228997 and GSE228156. It may be happening every time; I'm not sure.
We shouldn't raise a warning if the inferred scale is LOGBASEUNKNOWN and the assigned one is some specific log-scale.
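A minimal sketch of that rule, not the actual Gemma code: the `Scale` enum and `should_warn_scale_mismatch` are hypothetical names, and only LOG2 and LOGBASEUNKNOWN appear in the logs above; the other scale values are assumed for illustration.

```python
from enum import Enum


class Scale(Enum):
    # Only LOG2 and LOGBASEUNKNOWN come from the logs; the rest are assumed.
    LINEAR = "LINEAR"
    LN = "LN"
    LOG2 = "LOG2"
    LOG10 = "LOG10"
    LOGBASEUNKNOWN = "LOGBASEUNKNOWN"


# Scales that name a specific logarithm base.
SPECIFIC_LOG_SCALES = {Scale.LN, Scale.LOG2, Scale.LOG10}


def should_warn_scale_mismatch(assigned: Scale, inferred: Scale) -> bool:
    """Return True if the assigned/inferred scale pair deserves a warning."""
    if assigned == inferred:
        return False
    # An inferred "log, base unknown" is compatible with any specific
    # log scale on the quantitation type, so it is not a mismatch.
    if inferred is Scale.LOGBASEUNKNOWN and assigned in SPECIFIC_LOG_SCALES:
        return False
    return True
```

With this rule, the LOG2 versus LOGBASEUNKNOWN case from the logs stays silent, while LOG2 versus an inferred linear scale would still warn.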
As for the ratiometric part, it means that the mean of the data is extremely close to zero. I can relax the threshold a little bit.
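To illustrate why the threshold matters, here is a rough sketch of a mean-near-zero heuristic. The function name `looks_ratiometric` and the threshold values are assumptions made for the example; the real check and its cutoff may differ.

```python
import math
from statistics import fmean


def looks_ratiometric(values, threshold):
    """Flag data as ratiometric when its mean is within `threshold` of zero.

    Hypothetical heuristic: `threshold` is an illustrative parameter, not
    the value used by the real check. Missing values (NaN) are skipped.
    """
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        return False
    return abs(fmean(finite)) < threshold


# Made-up values whose mean is about 0.05.
sample = [-0.4, 0.1, 0.5, 0.0]

flagged_loose = looks_ratiometric(sample, threshold=0.1)    # flagged
flagged_strict = looks_ratiometric(sample, threshold=0.01)  # not flagged
```

A smaller threshold flags fewer data sets, which is the direction needed to reduce false positives on data that merely happens to be centred near zero.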
We really need to eliminate the excessive false positives. This is for an RNA-seq experiment where we always have control over the scaling. If this is unavoidable, then disable all checks for RNA-seq data. The quantitation type ambiguities are all for microarray data sets.
Having to remember to pass -ignoreqm to makeProcessedData is annoying and defeats the purpose of the checks.
Error while processing ExpressionExperiment Id=35086 Name=Genotype-by-diet interactions determine susceptibility and resistance in T2D mouse models [Hypothalamus] Short Name=GSE235479:
QuantitationType Id=591666 Name=log2cpm General Type=QUANTITATIVE Type=AMOUNT Scale=LOG2 Representation=DOUBLE [Recomputed From Raw] [Preferred]:
The scale LOG2 differs from the one inferred from data: LOGBASEUNKNOWN.