Keep only certain columns of a matrix

Date: 2015-10-09 05:17:54

Tags: r matrix bioinformatics

I have a large matrix that looks like this, but with many more columns and rows:

---    CellName1 CellName2 CellName3
Gene A    1           2        3
Gene B    4           5        6
Gene C    7           8        9 
Gene D    10          11       12

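The matrix spans thousands of columns and rows. I only need about 13 columns of the spreadsheet, and I have listed their headers in a .txt file.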

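What is the best way to keep only the columns with specific headers (the ones listed in the text file) while also keeping the reference column (Gene A, Gene B, etc.)?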

I tried creating a vector of strings with the names of the columns I want, but I got an unexpected symbol error:

rsem.genes.tpm <- read.delim("~/Desktop/Desktop/rsem.genes.tpm.matrix", header=FALSE)
View(rsem.genes.tpm)
library("cluster", lib.loc="/Library/Frameworks/R.framework/Versions/3.2/Resources/library")

wanted<-c("cy82-CD45-pos-2-C04-S508-comb”, "Cy74_CD45_A06_S390_comb”, “Cy74_CD45_C07_S415_comb”, “Cy74_CD45_H08_S476_comb”, “cy53-1-CD45-pos-1-A03-S3-comb”, “cy53-1-CD45-pos-2-A10-S970-comb”, “cy53-1-CD45-pos-2-B03-S975-comb”, “cy53-1-CD45-pos-2-B09-S981-comb”, “cy53-1-CD45-pos-2-B12-S984-comb”, “cy53-1-CD45-pos-2-C10-S994-comb”, “cy53-1-CD45-pos-2-C12-S996-comb”, “cy53-1-CD45-pos-2-D09-S1005-comb”, “cy53-1-CD45-pos-2-D10-S1006-comb”, “cy53-1-CD45-pos-2-H05-S1049-comb”, “cy58-1-CD45-pos-A09-S585-comb”)  # Concatenates character strings into a vector
Error: unexpected symbol in "wanted<-c("cy82-CD45-pos-2-C04-S508-comb”, "Cy74_CD45_A06_S390_comb"

I think I want some combination of keep and drop functions, but I'm not sure how to implement them.

1 answer:

Answer 0 (score: 0)

Let's create some sample data:

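set.seed(123)  # call the function to seed the RNG; `set.seed <- 123` would only create a variable
df <- as.data.frame(matrix(sample(1:10, 25, replace = TRUE), nrow = 5))
df$gene <- paste('Gene', LETTERS[1:5])  # "Gene A" ... "Gene E"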

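Suppose you only want to keep the columns 'V1', 'V2' and 'V5' (the values below come from one random draw, so your numbers will likely differ):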

to.keep <- c('V1','V2','V5')
> df[,c('gene',to.keep)]
    gene V1 V2 V5
1 Gene A  6  1  5
2 Gene B  9  4  1
3 Gene C  7  7  8
4 Gene D  4 10  1
5 Gene E 10  4  4
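
The same indexing idea can be applied when the wanted headers live in a text file, as in the question. The following is a minimal sketch, not a definitive solution: it assumes the matrix is tab-delimited with the column names on its first line, that the text file holds one column name per line, and that the reference column is called gene. The file contents, the name gene, and the variables expr, wanted and subset_expr are all made up for illustration. Reading the names from the file also avoids typing them by hand, which is where the curly quotes (”) that trigger the "unexpected symbol" error came from; R only accepts straight quotes (" or ').

```r
# Hypothetical stand-ins for the real files: a small tab-delimited matrix
# and a text file with one wanted column name per line.
matrix_file <- tempfile(fileext = ".txt")
wanted_file <- tempfile(fileext = ".txt")
writeLines(c("gene\tCellName1\tCellName2\tCellName3",
             "Gene A\t1\t2\t3",
             "Gene B\t4\t5\t6",
             "Gene C\t7\t8\t9",
             "Gene D\t10\t11\t12"), matrix_file)
writeLines(c("CellName1", "CellName3"), wanted_file)

# header = TRUE takes the column names from the first line.
# check.names = FALSE keeps names such as "cy82-CD45-pos-2" unchanged
# (by default R would replace the hyphens with dots).
expr <- read.delim(matrix_file, header = TRUE, check.names = FALSE,
                   stringsAsFactors = FALSE)

# Read the wanted names, trimming whitespace and dropping blank lines.
wanted <- trimws(readLines(wanted_file))
wanted <- wanted[nzchar(wanted)]

# Report names that are absent instead of failing with "undefined columns".
missing_cols <- setdiff(wanted, names(expr))
if (length(missing_cols) > 0) {
  warning("Not found in the matrix: ", paste(missing_cols, collapse = ", "))
}

# Keep the reference column plus the wanted columns that exist.
subset_expr <- expr[, c("gene", intersect(wanted, names(expr))), drop = FALSE]
print(subset_expr)
```

Note that the question's own read.delim call uses header=FALSE, which would leave the cell names in the first data row rather than in the column names, so selecting by name would not work without switching to header = TRUE.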