cn_salmon Error: arguments implying differing number of rows
Hi, I'm running CellNet following the Nature Protocols paper. After downloading the example data from SRP059670, I ran the cn_salmon(stQuery) command, but the following error occurred:
determining read length.
Trimming reads.
Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 21, 22
I then read through the cn_salmon function in your code, but I still could not fix the bug. It seems to me that the program reports this error because the number of samples in the sample table does not match the number of files in the working directory, but that is not the case for me. What should I do next? Thanks!
Hi,
Thanks for your interest in using CellNet. Are you running this on AWS? Which branch are you using? Can you share the precise steps you have taken up to the error?
Hi, I am running on my own MacBook Pro. The code is listed below:
library(devtools)
install_github("pcahan1/CellNet", ref = "rpackage")
library(CellNet)
setwd("./Bioinformatics/CellNet")
cn_setup(local = TRUE)
iFileMouse <- "salmon.index.mouse.122116.tgz"
fetchIndexHandler(destination = "ref/", species = "mouse", iFile=iFileMouse)
download.file("https://s3.amazonaws.com/CellNet/rna_seq/mouse/examples/SRP059670/st_SRP059670_example.rda", "st_SRP059670_example.rda")
stQuery <- utils_loadObject("st_SRP059670_example.rda")
stQuery <- cn_s3_fetchFastq("CellNet","rna_seq/mouse/examples/SRP059670",stuQuery,fname="fname", compressed="gz")
pathToSalmon <- "/Users/danliu/miniconda2/pkgs/salmon-0.7.2-0/bin"
expList <- cn_salmon(stQuery, refDir = "ref/", salmonIndex = iFileMouse, fname<-paste0("expList_SRP059670_example.rda"), salmonPath = pathToSalmon)
Strangely enough, after downloading all the FASTQ files, R itself would not decompress them, so I ran gzip on all the FASTQ files in the console and then ran the cn_salmon() function, which returned the error mentioned above, implying a mismatch in the number of files. I then tried removing one of the fastq.gz files from the working directory and reran cn_salmon(), which returned an error saying differing number of rows: 21, 20. I hope this clarifies my problem.
Thanks for providing that information. I'll look into this...
Can CellNet work with pseudo-counts produced by Salmon rather than FASTQ files?
Hi,
Can you try running with our latest version (https://github.com/pcahan1/CellNet/tree/v0.2.2) instead of the one listed in the protocols:
install_github("pcahan1/CellNet", ref = "v0.2.2")
I am running into the same problem. I am running CellNet on my university's cluster. I believe the issue is the fastq_trim function, specifically in the line that specifies nnames.
nnames <- paste(unlist(strsplit(fnames, ".fastq")), "_trimmed.fq", sep = "")
The files I am working with are gzipped. Because my file names end in .gz, the strsplit function appears to split each name in the middle, right after fastq, leaving the trailing .gz as a separate piece instead of producing one name per file as it would if the split happened after fastq.gz. The result is that nnames ends up with an additional row, causing the error.
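To make the splitting behaviour concrete, here is a minimal sketch in base R. The file names are made up for illustration, and trimmed_names is a hypothetical helper, not part of CellNet; it only shows one way to get exactly one output name per input file. Whether the downstream trimming step can read gzipped input is not addressed here.

```r
# Hypothetical file names: one plain FASTQ, one gzipped FASTQ.
fnames <- c("sample1.fastq", "sample2.fastq.gz")

# Same expression as the quoted line. strsplit treats ".fastq" as a regular
# expression and splits the gzipped name into two pieces ("sample2", ".gz"),
# so the result has more elements than there are input files.
pieces <- unlist(strsplit(fnames, ".fastq"))
nnames_original <- paste(pieces, "_trimmed.fq", sep = "")

# Hypothetical alternative (an assumption, not CellNet's actual fix):
# replace the ".fastq" or ".fastq.gz" suffix in one step, which always
# returns one name per input file.
trimmed_names <- function(fnames) {
  sub("\\.fastq(\\.gz)?$", "_trimmed.fq", fnames)
}
nnames_sketch <- trimmed_names(fnames)

print(length(fnames))           # number of input files
print(length(nnames_original))  # number of names from the strsplit version
print(nnames_sketch)
```

Comparing length(nnames_original) with length(fnames) on your own file list is a quick way to check whether this mismatch is what triggers the data.frame error in your case.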
I unzipped my files and ran cn_salmon and did not get the error.
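For anyone who wants to do the decompression from within R, a rough base-R sketch follows. gunzip_file is a hypothetical helper name, not a CellNet function, and the sketch assumes the files sit in the current working directory and end in .fastq.gz. It leaves the original .gz files in place, so the file names in the sample table may still need to be adjusted to match.

```r
# Hypothetical helper (not part of CellNet): decompress one .gz file next to
# the original, using only base R connections.
gunzip_file <- function(gz_path) {
  out_path <- sub("\\.gz$", "", gz_path)
  con_in <- gzfile(gz_path, open = "rb")   # reads decompressed bytes
  on.exit(close(con_in), add = TRUE)
  con_out <- file(out_path, open = "wb")
  on.exit(close(con_out), add = TRUE)
  repeat {
    chunk <- readBin(con_in, what = raw(), n = 1e6)
    if (length(chunk) == 0) break          # end of input
    writeBin(chunk, con_out)
  }
  invisible(out_path)
}

# Assumed layout: gzipped FASTQ files in the current working directory.
gz_files <- list.files(pattern = "\\.fastq\\.gz$")
for (f in gz_files) gunzip_file(f)
```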
I also got this error. Can you tell me how to solve it? Many thanks. The error is below:
awk: cmd. line:1: (FILENAME=- FNR=2) warning: Invalid multibyte data detected. There may be a mismatch between your data and your locale.
Trimming reads
Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 9, 10
We have now created a web application that takes as input an expression matrix (counts, TPM, or FPKM), and sample meta-data, and performs CellNet analysis. Additionally, this tool includes analysis of many state-of-the-art differentiation protocols, so that you can benchmark your results against those commonly used methods:
https://cahanlab.org/resources/agnosticCellNet_web/