[BUG] continuous_factor errors out in _build_contrast
Describe the bug
Listing a factor only in continuous_factors of a DeseqDataSet (and not in design_factors) causes an error in DeseqStats when building a contrast. However, if the same factor is also included in design_factors, it is converted to a Categorical type and works without error.
To Reproduce
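from pydeseq2.dds import DeseqDataSet
from pydeseq2.ds import DeseqStats

# adf is an AnnData object whose obs has "treatment" and "time" columns
dds = DeseqDataSet(
    adata=adf,
    design_factors=["treatment"],
    continuous_factors=["time"],
    ref_level=["treatment", "CTRL"],
)
stat_res_time = DeseqStats(dds, contrast=["time", "", ""])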
The dtype of adf.obs.time is int64. This raises the following error in the _build_contrast call:
The contrast variable ('time') should be one of the design factors.
Would just changing the check in _build_contrast in DeseqStats be enough?
pydeseq2 version: 0.4.11
Expected behavior
The if statement in _build_contrast should also check whether the factor is in self.dds.continuous_factors.
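A minimal sketch of the check proposed here, written as a standalone function rather than as pydeseq2's actual _build_contrast code. The function name and its arguments are hypothetical; it only illustrates accepting a contrast variable that appears in either list.

```python
def contrast_factor_is_known(contrast, design_factors, continuous_factors=None):
    """Return True if the contrast variable is a design or continuous factor.

    Hypothetical helper for illustration only; not part of pydeseq2.
    """
    continuous_factors = continuous_factors or []
    factor = contrast[0]  # first entry of the contrast is the variable name
    return factor in design_factors or factor in continuous_factors


# "time" is only listed as a continuous factor, as in the report above
print(contrast_factor_is_known(["time", "", ""], ["treatment"], ["time"]))
# Without the extra check, "time" would not be recognised
print(contrast_factor_is_known(["time", "", ""], ["treatment"]))
```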
Desktop
OS: Ubuntu 22.04
Hi @jeffhsu3,
Continuous factors must indeed also be listed as design factors. (That is what the error message you get means, though it may not be very clear.)
I.e., in your case, the code should be changed to
dds = DeseqDataSet(
    adata=adf,
    design_factors=["treatment", "time"],
    continuous_factors=["time"],
    ref_level=["treatment", "CTRL"],
)
stat_res_time = DeseqStats(dds, contrast=["time", "", ""])
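The rule above (every continuous factor must also be a design factor) can be checked before constructing the dataset. This is only a sketch in plain Python; the helper name is hypothetical and it is not a pydeseq2 function.

```python
def continuous_not_in_design(design_factors, continuous_factors):
    """List continuous factors that are missing from the design factors.

    Hypothetical pre-flight check for illustration; not part of pydeseq2.
    """
    return [f for f in continuous_factors if f not in design_factors]


# Original call from the report: "time" is missing from design_factors
print(continuous_not_in_design(["treatment"], ["time"]))
# Corrected call: nothing is missing
print(continuous_not_in_design(["treatment", "time"], ["time"]))
```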
You mentioned that this caused time to be treated as a categorical factor. Could you provide an example of this behaviour?
Thanks!
Thanks!
The design matrix isn't affected, but the obs df is.
print(adf.obs.time.dtype)  # Output: dtype('int64')
# After creating DeseqDataSet
dds = DeseqDataSet(
    adata=adf,
    design_factors=["treatment", "time"],
    continuous_factors=["time"],
    ref_level=["treatment", "CTRL"],
)
print(dds.obs.time.dtype)  # Output: dtype('O')
# The "time" column of the design matrix keeps its numeric dtype
print(dds.obsm['design_matrix']['time'].dtype)  # Output: dtype('int64')
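For anyone who needs the numeric dtype back in obs, here is a small pandas-only sketch of how the change could be detected and undone. It does not use pydeseq2: the cast to object below is just a stand-in (an assumption) that mimics the dtype('O') reported above, and the column values are made up for illustration.

```python
import pandas as pd

# Toy metadata table standing in for adf.obs (values are illustrative)
obs = pd.DataFrame(
    {"treatment": ["CTRL", "TRT", "CTRL", "TRT"], "time": [0, 0, 1, 1]}
)
original_dtypes = obs.dtypes.copy()

# Stand-in for the reported behaviour: "time" ends up with object dtype
converted = obs.copy()
converted["time"] = converted["time"].astype(object)

# Detect which columns changed dtype
changed = [c for c in obs.columns if converted[c].dtype != original_dtypes[c]]
print(changed)

# Convert the affected column back to a numeric dtype
restored = pd.to_numeric(converted["time"])
print(restored.dtype)
```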
Closing this because of #328