TCGAWorkflow

GAIA aberrant region description

Open PubuduSaneth opened this issue 7 years ago • 1 comment

I followed the TCGAWorkflow to run GAIA on TCGA malignant melanoma (SKCM) level 3 segment data. According to the GAIA load_cnv documentation, the estimated copy number for a segmented region (the kind of aberration) is 0, 1, or 2 for losses, LOHs, and gains, respectively. However, in the TCGAWorkflow section "Identification of recurrent CNV in cancer", cnvMatrix contains 0s for losses and 1s for gains.

# Add label (0 for loss, 1 for gain)
cnvMatrix <- cbind(cnvMatrix,Label=NA)
cnvMatrix[cnvMatrix[,"Segment_Mean"] < -0.3,"Label"] <- 0
cnvMatrix[cnvMatrix[,"Segment_Mean"] > 0.3,"Label"] <- 1
cnvMatrix <- cnvMatrix[!is.na(cnvMatrix$Label),]

It would be extremely helpful if you could clarify the reason for deviating from the GAIA documentation, or let me know whether I have misunderstood the TCGAWorkflow.

PubuduSaneth, Jul 06 '18 10:07

Sorry for the delay. This section was entirely written by Fulvio. Here is his answer:

“In the pipeline we described for the identification of recurrent CNV in cancer, we considered only two aberrations: gain, defined as log2(copy-number/2) > 0.3, and loss, defined as log2(copy-number/2) < -0.3. So, according to the GAIA load_cnv documentation (the value must be an integer in the range 0..(K-1), where K is the number of aberrations considered), the copy number in the passed segmentation_matrix can be 0 (loss) or 1 (gain).”
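
As a rough illustration of the rule quoted above, the sketch below labels a toy segment table with K = 2 aberrations and checks that every label falls in 0..(K-1). The sample names and Segment_Mean values are made up for the example, and the check is only a sanity test written for this thread, not part of GAIA or TCGAWorkflow.

```r
# Toy segment table. Column names follow the snippet in the question;
# the sample names and values are invented for illustration only.
cnvMatrix <- data.frame(
  Sample       = c("S1", "S1", "S2", "S2"),
  Chromosome   = c(1, 2, 1, 3),
  Segment_Mean = c(-0.8, 0.1, 0.5, -0.35)
)

# Two aberration kinds are considered, so K = 2 and labels are 0 or 1.
K <- 2L

# Label segments: 0 = loss (Segment_Mean < -0.3), 1 = gain (> 0.3).
cnvMatrix$Label <- NA_integer_
cnvMatrix$Label[cnvMatrix$Segment_Mean < -0.3] <- 0L
cnvMatrix$Label[cnvMatrix$Segment_Mean >  0.3] <- 1L

# Drop segments that are neither a loss nor a gain.
cnvMatrix <- cnvMatrix[!is.na(cnvMatrix$Label), ]

# Sanity check: every label is an integer in 0..(K-1).
stopifnot(all(cnvMatrix$Label %in% 0:(K - 1L)))
```

If a third kind of aberration such as LOH were included, K would become 3 and the labels would need to span 0..2. Segment_Mean alone does not identify LOH, so that case is not sketched here.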

tiagochst, Sep 24 '18 16:09