cn_salmon error: arguments imply differing number of rows

Open dliu201304 opened this issue 7 years ago • 8 comments

Hi, I'm running CellNet following the Nature Protocols paper. After downloading the example data from SRP059670, I ran cn_salmon(stQuery) and got the following error:

determining read length.
Trimming reads.
Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 21, 22

I then read through the source of cn_salmon but still could not fix the bug. It looks to me as though the error is raised because the number of samples in the sample table does not match the number of files in the working directory, but that is not the case in my setup. What should I do next? Thanks!
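For context, this message is raised by base R's data.frame() whenever the vectors passed as columns have different lengths. The sketch below reproduces the same message outside CellNet; the column names and values are made up for illustration and are not taken from the cn_salmon source.

```r
# Minimal reproduction of the base R error, independent of CellNet.
# The column names (fname, trimmed) are hypothetical.
err <- tryCatch(
  data.frame(
    fname   = paste0("file", 1:21),     # 21 entries
    trimmed = paste0("trimmed", 1:22),  # 22 entries
    check.names = FALSE
  ),
  error = function(e) conditionMessage(e)
)
print(err)
```

So the two numbers in the message are simply the lengths of the mismatched inputs at the point where the data frame is built.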

dliu201304 avatar Oct 07 '18 02:10 dliu201304

Hi,

Thanks for your interest in using CellNet. Are you running this on AWS? Which branch are you using? Can you share the precise steps you have taken up to the error?

pcahan1 avatar Oct 08 '18 09:10 pcahan1

Hi, I am running on my own MacBook Pro. The code is listed below:

library(devtools)
install_github("pcahan1/CellNet", ref = "rpackage")
library(CellNet)
setwd("./Bioinformatics/CellNet")
cn_setup(local = TRUE)
iFileMouse <- "salmon.index.mouse.122116.tgz"
fetchIndexHandler(destination = "ref/", species = "mouse", iFile = iFileMouse)
download.file("https://s3.amazonaws.com/CellNet/rna_seq/mouse/examples/SRP059670/st_SRP059670_example.rda", "st_SRP059670_example.rda")
stQuery <- utils_loadObject("st_SRP059670_example.rda")
stQuery <- cn_s3_fetchFastq("CellNet", "rna_seq/mouse/examples/SRP059670", stQuery, fname = "fname", compressed = "gz")
pathToSalmon <- "/Users/danliu/miniconda2/pkgs/salmon-0.7.2-0/bin"
expList <- cn_salmon(stQuery, refDir = "ref/", salmonIndex = iFileMouse, fname<-paste0("expList_SRP059670_example.rda"), salmonPath = pathToSalmon)

Strangely, after downloading all the fastq files, R itself would not decompress them, so I ran gzip on all the fastq files in the console and then ran cn_salmon(), which returned the error mentioned above, implying a mismatch in the number of files. I then removed one of the fastq.gz files from the working directory and reran cn_salmon(), which returned the error "differing number of rows: 21, 20". I hope this clarifies my problem.
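One way to test the "sample table versus files on disk" hypothesis directly is to compare the two sets of names before calling cn_salmon(). The following is only a sketch: it assumes the sample table stores file names in a column called fname (suggested by the fname = "fname" argument above), and it uses a made-up table and a temporary directory in place of the real data.

```r
# Hypothetical sample table standing in for stQuery; the fname column is an assumption.
st <- data.frame(
  sample_name = c("s1", "s2"),
  fname = c("s1.fastq.gz", "s2.fastq.gz"),
  stringsAsFactors = FALSE
)

# Temporary directory standing in for the working directory.
work_dir <- tempfile("fastq_check_")
dir.create(work_dir)
invisible(file.create(file.path(work_dir, c("s1.fastq.gz", "s2.fastq.gz", "extra.fastq.gz"))))

# Files present on disk, gzipped or not.
on_disk <- list.files(work_dir, pattern = "\\.fastq(\\.gz)?$")

missing_files  <- setdiff(st$fname, on_disk)  # listed in the table, absent on disk
unlisted_files <- setdiff(on_disk, st$fname)  # present on disk, absent from the table

print(missing_files)
print(unlisted_files)
```

If both vectors come back empty for the real table and directory, the mismatch is more likely introduced inside the pipeline than by the inputs.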

dliu201304 avatar Oct 09 '18 13:10 dliu201304

Thanks for providing that information. I'll look into this...

pcahan1 avatar Oct 09 '18 13:10 pcahan1

Can CellNet work with pseudo-counts produced by Salmon rather than with fastq files?

dliu201304 avatar Oct 16 '18 03:10 dliu201304

Hi,

Can you try running with our latest version (https://github.com/pcahan1/CellNet/tree/v0.2.2) instead of the one listed in the protocol:

install_github("pcahan1/CellNet", ref = "v0.2.2")

pcahan1 avatar Oct 17 '18 02:10 pcahan1

I am running into the same problem. I am running CellNet on my university's cluster. I believe the issue is the fastq_trim function, specifically in the line that specifies nnames.

nnames <- paste(unlist(strsplit(fnames, ".fastq")), "_trimmed.fq", sep = "")

The files I am working with are gzipped. Because my file names end in .gz, strsplit splits each name in the middle, right after .fastq, instead of treating the whole .fastq.gz suffix as the end of the name. The leftover .gz becomes a separate element, so nnames ends up with an additional row, which causes the error.

I unzipped my files and ran cn_salmon and did not get the error.
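The behaviour described above can be reproduced in base R with made-up file names. The sketch below is not the CellNet fix; the sub() call is just one possible way to derive the trimmed names so that the output has one entry per input file whether or not it is gzipped.

```r
# Hypothetical file names: one plain, one gzipped.
fnames <- c("sampleA.fastq", "sampleB.fastq.gz")

# Same expression as the quoted line: the text after ".fastq" (here ".gz")
# survives the split as its own element, so the result has more entries than fnames.
nnames_old <- paste(unlist(strsplit(fnames, ".fastq")), "_trimmed.fq", sep = "")
print(nnames_old)

# Possible alternative: replace the whole suffix, with an optional ".gz",
# anchored at the end of the name. sub() always returns one value per input.
nnames_new <- sub("\\.fastq(\\.gz)?$", "_trimmed.fq", fnames)
print(nnames_new)
```

Unzipping the files, as done here, avoids the problem for the same reason: nothing is left over after the .fastq split.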

khajdarovic avatar Dec 17 '18 23:12 khajdarovic

I also got this error. Can you tell me how to solve it? Thank you very much. The error is as follows:

awk: cmd. line:1: (FILENAME=- FNR=2) warning: Invalid multibyte data detected. There may be a mismatch between your data and your locale.
Trimming reads
Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 9, 10

gloriafight avatar May 02 '20 12:05 gloriafight

We have now created a web application that takes as input an expression matrix (counts, TPM, or FPKM), and sample meta-data, and performs CellNet analysis. Additionally, this tool includes analysis of many state-of-the-art differentiation protocols, so that you can benchmark your results against those commonly used methods:

https://cahanlab.org/resources/agnosticCellNet_web/

pcahan1 avatar Nov 18 '21 18:11 pcahan1