S4Vectors icon indicating copy to clipboard operation
S4Vectors copied to clipboard

undesirable promotion of SimpleLists to subclasses by endoapply

Open LTLA opened this issue 5 years ago • 2 comments

Consider the following example:

library(S4Vectors)
hits <- SimpleList(A=SelfHits(1,1,1), B=SelfHits(1,1,1))
## List of length 2
## names(2): A B

endoapply(hits, sort)
## SelfHitsList of length 2
## names(2): A B

Technically speaking, this is not wrong as endoapply guarantees that the return value is of the same class and a SelfHitsList is, after all, a SimpleList. But from a practical perspective, this is a bother as I cannot use the output of endoapply in the same way that I might use a SimpleList, e.g., because I can't assign non-SelfHits to it.

A closely related problem is that as(x, "SimpleList") doesn't actually create a SimpleList, but tends to try to promote things to the most specific form of a list that it can support. For example:

as(list(1,2), "SimpleList")
## NumericList of length 2
## [[1]] 1
## [[2]] 2

In fact, the behavior above is arguably incorrect because a NumericList isn't even a subclass of SimpleList.

The solution may be as simple as adding an argument to coerceToSimpleList to tell it to avoid aggressively promoting the output class within any coerce method towards a SimpleList.

Session information
R version 4.0.0 Patched (2020-05-01 r78341)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /home/luna/Software/R/R-4-0-branch-dev/lib/libRblas.so
LAPACK: /home/luna/Software/R/R-4-0-branch-dev/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] testthat_2.3.2              SingleCellExperiment_1.11.5
 [3] SummarizedExperiment_1.19.5 DelayedArray_0.15.4        
 [5] matrixStats_0.56.0          Matrix_1.2-18              
 [7] Biobase_2.49.0              GenomicRanges_1.41.5       
 [9] GenomeInfoDb_1.25.2         IRanges_2.23.10            
[11] S4Vectors_0.27.12           BiocGenerics_0.35.4        

loaded via a namespace (and not attached):
 [1] lattice_0.20-41        bitops_1.0-6           R6_2.4.1              
 [4] grid_4.0.0             magrittr_1.5           rlang_0.4.6           
 [7] zlibbioc_1.35.0        XVector_0.29.2         tools_4.0.0           
[10] RCurl_1.98-1.2         compiler_4.0.0         GenomeInfoDbData_1.2.3

LTLA avatar Jun 21 '20 00:06 LTLA

Changing this behavior would likely cause a lot of problems with existing code. If you want to create an instance of a specific class, then call its constructor, SimpleList() in this case.

Btw, in the NumericList example, it's just the simplified printing that is confusing. The object is actually a SimpleNumericList.

It may be worth investigating why as() even lets us do this, since the default strict=TRUE is supposed to prevent it.

lawremi avatar Jun 22 '20 17:06 lawremi

If you want to create an instance of a specific class, then call its constructor,

I think the same principle should apply to coercion: if I want to coerce to a specific class, then I should be able to coerce to that specific class. So if I want a SimpleNumericList instance, I do as(x, "SimpleNumericList"). If I want a SimpleList instance, I do as(x, "SimpleList"). This is the basic contract of coercion after all.

The only exception should be when the target class is virtual (e.g. NumericList), in which case as(x, class) is allowed to return an instance of a subclass of the specified class.

Another healthy property of coercion is idempotence i.e. as(as(x, "SomeClass"), "SomeClass") should be the same as as(x, "SomeClass") but we don't have this right now:

> class(as(list(1,2), "SimpleList"))
[1] "SimpleNumericList"
attr(,"package")
[1] "IRanges"

> class(as(as(list(1,2), "SimpleList"), "SimpleList"))
[1] "SimpleList"
attr(,"package")
[1] "S4Vectors"

hpages avatar Jun 22 '20 18:06 hpages