Proteomics: creating an MSnSet class object with MSnbase

Date: 2018-05-29 18:27:33

Tags: r bioinformatics bioconductor genome protein-database

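I want to create an MSnSet object (proteomics data; the values are spectral counts), but I get an error message and I am stuck, despite reading the manual, the help pages, forums, etc.

You can get my files here: https://www.dropbox.com/sh/dw7zfgiku6cteba/AADP3U2yxB5LgXy5ykJYFf0ga?dl=0

Here is the code I tried: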

setwd("~/Desktop/analyse")

## The spectral counts data:

data <- as.character(read.delim("sc.txt", header=TRUE,sep="\t", row.names=1, as.is=TRUE))

## Feature meta-data:
fdata <- as.character(read.delim("fdata.txt", header=TRUE,sep="\t", row.names=1, as.is=TRUE))

## Pheno data:
pdata <- as.character(read.delim("pheno.txt", header=TRUE,sep="\t", row.names=1, as.is=TRUE)) 


library("MSnbase")

readMSnSet(exprsFile = data,
                  phenoDataFile = pdata,
                  featureDataFile = fdata,
                  header=TRUE)

The last command returns this error message:

Error in file(file, "rt") : invalid 'description' argument

I have checked the following:

class(data)
[1] "character"
class(fdata)
[1] "character"
class(pdata)
[1] "character"

dim(data)
NULL

dim(fdata) 
NULL

dim(pdata) 
NULL

str(data)
 chr [1:15] "c(4, 6, 11, 4, 3, 6, 2, 9, 8, 14, 15, 2, 8, 16, 5, 0, 0, 0, 0, 1, 2, 0, 0, 2, 1, 1, 0, 13, 11, 5, 0, 4, 6, 116,"| __truncated__ ...

str(pdata)
 chr [1:2] "c(\"treatmentA\", \"treatmentA\", \"treatmentA\", \"treatmentA\", \"treatmentA\", \"treatmentB\", \"treatmentB\"| __truncated__ ...

str(fdata)
chr [1:9] "c(222, 273.06, 335.8638, 413.112474, 508.128343, 624.9978619, 768.7473702, 945.5592653, 1163.037896, 1430.53661"| __truncated__ ...
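
The `str()` output above is what `as.character()` gives for any data frame: a data frame is a list of columns, so each column is deparsed into a single string and the table structure is lost. A minimal sketch on a made-up data frame (`toy` and its values are purely illustrative, not taken from the files above):

```r
# Toy data frame standing in for the result of read.delim() (hypothetical values)
toy <- data.frame(sample1 = c(4, 6, 11),
                  sample2 = c(0, 1, 2),
                  row.names = c("prot1", "prot2", "prot3"))

# as.character() treats the data frame as a list of columns and
# returns one deparsed string per column
flat <- as.character(toy)

length(flat)   # one element per column, not one per cell
dim(flat)      # NULL: the result is a plain character vector
```

This would explain why `data` has 15 elements and `dim(data)` is `NULL`: presumably one element per column of sc.txt.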

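I also tried "as.matrix()" instead of "as.character()" for "data", and "as.data.frame()" instead of "as.character()" for "fdata" and "pdata".

In that case the dimensions are correct rather than "NULL", but it does not solve the problem, because I then get the following message: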

Error in (function (file, header = FALSE, sep = "", quote = "\"'", dec = ".",      : 
  'file' must be a character string or connection

If I try:

all(rownames(pdata)==colnames(data))
   TRUE

I also tried to create my MSnSet object with the following (files initially read with as.character ...):

MSnSet(data, fdata, pdata)
Error in (function (storage.mode = c("lockedEnvironment", "environment",  : 
  'AssayData' elements with invalid dimensions: 'exprs'

If I read the file with "as.matrix" for "data", and with "as.data.frame" for "fdata" and "pdata":

> MSnSet(data, fdata, pdata)
Error in validObject(.Object) : 
  invalid class “MSnSet” object: 1: feature numbers differ between assayData and featureData
invalid class “MSnSet” object: 2: featureNames differ between assayData and featureData

> row.names(pdata)[1:5]
[1] "sample1" "sample2" "sample3" "sample4" "sample5"
> colnames(data)[1:5]
[1] "sample1" "sample2" "sample3" "sample4" "sample5"
> row.names(data)[1:5]
[1] "prot1" "prot2" "prot3" "prot4" "prot5"
> row.names(fdata)[1:5]
[1] "prot1" "prot2" "prot3" "prot4" "prot5"
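
The validity messages above are about feature counts and feature names, and comparing only the first five names cannot rule out a mismatch further down. A hedged sketch of checking the full tables instead; `check_msnset_inputs` is a hypothetical helper written for illustration, not an MSnbase function, and the toy inputs are invented:

```r
# Hypothetical helper (not part of MSnbase): compares the row and column
# names that the expression matrix, feature data and pheno data must share
check_msnset_inputs <- function(exprs, fdata, pdata) {
  c(same_feature_count = nrow(exprs) == nrow(fdata),
    same_feature_names = identical(rownames(exprs), rownames(fdata)),
    same_sample_count  = ncol(exprs) == nrow(pdata),
    same_sample_names  = identical(colnames(exprs), rownames(pdata)))
}

# Illustrative toy inputs; the feature data deliberately has one extra row
exprs <- matrix(1:6, nrow = 3,
                dimnames = list(paste0("prot", 1:3), paste0("sample", 1:2)))
fdata <- data.frame(mass = c(222, 273.06, 335.86, 413.11),
                    row.names = paste0("prot", 1:4))
pdata <- data.frame(treatment = c("treatmentA", "treatmentB"),
                    row.names = paste0("sample", 1:2))

check_msnset_inputs(exprs, fdata, pdata)
```

Any `FALSE` in the result points at the pair of tables whose names or counts need to be reconciled before building the object.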

So I do not know where the problem comes from. Any ideas on how to create my MSnSet object correctly?

Thank you very much for your help.

SkyR

1 answer:

Answer 0: (score: 3)

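Judging by the argument name and the help page, phenoDataFile should be the path to a file, not the contents of the file. From your question, I would guess the argument should be phenoDataFile = "~/Desktop/analyse/pheno.txt". The same goes for exprsFile and featureDataFile.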
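
A minimal sketch of that suggestion. To keep it self-contained it first writes three tiny tab-delimited files to a temporary directory; the file contents are invented stand-ins for sc.txt, pheno.txt and fdata.txt, and the `readMSnSet()` call is only run if MSnbase is installed:

```r
# Write three toy tab-delimited files (hypothetical contents) to a temp directory
demo_dir <- tempfile("msnset_demo")
dir.create(demo_dir)
exprs_file <- file.path(demo_dir, "sc.txt")
pheno_file <- file.path(demo_dir, "pheno.txt")
fdata_file <- file.path(demo_dir, "fdata.txt")

write.table(data.frame(sample1 = c(4, 6), sample2 = c(0, 1),
                       row.names = c("prot1", "prot2")),
            exprs_file, sep = "\t", quote = FALSE, col.names = NA)
write.table(data.frame(treatment = c("treatmentA", "treatmentB"),
                       row.names = c("sample1", "sample2")),
            pheno_file, sep = "\t", quote = FALSE, col.names = NA)
write.table(data.frame(mass = c(222, 273.06),
                       row.names = c("prot1", "prot2")),
            fdata_file, sep = "\t", quote = FALSE, col.names = NA)

# Pass the file paths (single character strings), not the parsed tables
if (requireNamespace("MSnbase", quietly = TRUE)) {
  library("MSnbase")
  mset <- readMSnSet(exprsFile = exprs_file,
                     phenoDataFile = pheno_file,
                     featureDataFile = fdata_file,
                     sep = "\t", header = TRUE)
  print(mset)
}
```

With the files from the question, the same call would presumably take exprsFile = "~/Desktop/analyse/sc.txt" and featureDataFile = "~/Desktop/analyse/fdata.txt" alongside the phenoDataFile path above, with no read.delim() step beforehand.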