How can I collapse multiple values into a single list entry within a data frame, based on duplicated rows?

Time: 2019-05-30 02:08:03

Tags: r dataframe

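I have a set of genes associated with particular diseases, along with expression values for those genes. However, the data frame contains many duplicated entries, which gives the false impression that I have more genes than I actually do.

Is there a way to collapse the duplicated values into a list within the data frame? For example, I have: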

gene <- c("A", "B", "C", "D", "D", "E", "E")
expression <- c(2, 3, 1, 4, 4, 3, 3)
disease <- c("dis_A", "dis_B", "dis_C", "dis_D", "dis_D.1", "dis_E", "dis_E.1")
cbind(gene, expression, disease)

     gene expression disease  
[1,] "A"  "2"        "dis_A"  
[2,] "B"  "3"        "dis_B"  
[3,] "C"  "1"        "dis_C"  
[4,] "D"  "4"        "dis_D"  
[5,] "D"  "4"        "dis_D.1"
[6,] "E"  "3"        "dis_E"  
[7,] "E"  "3"        "dis_E.1"

And I would like something like this:

[1,] "A"  "2"        "dis_A"  
[2,] "B"  "3"        "dis_B"  
[3,] "C"  "1"        "dis_C"  
[4,] "D"  "4"        "dis_D, dis_D.1"  
[5,] "E"  "3"        "dis_E, dis_E.1"

Is there a way to do this? I know that aggregate exists, but I am not sure whether or how it applies to this case. Thanks!
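For illustration, here is a minimal sketch of one way aggregate might be applied to this case. It assumes the data are held in a data frame rather than the character matrix that cbind produces, and that gene holds the values shown in the printed output above; the names df and collapsed are made up for this example. The disease labels of each gene are pasted into one comma-separated string, which matches the desired output, rather than stored as a true list column.

```r
# Example data rebuilt from the question; `gene` is taken from the printed output.
gene <- c("A", "B", "C", "D", "D", "E", "E")
expression <- c(2, 3, 1, 4, 4, 3, 3)
disease <- c("dis_A", "dis_B", "dis_C", "dis_D", "dis_D.1", "dis_E", "dis_E.1")

# A data frame keeps expression numeric (cbind coerces everything to character).
df <- data.frame(gene, expression, disease, stringsAsFactors = FALSE)

# Group by gene and expression; paste the disease labels of each group together.
# The extra argument `collapse` is passed through to paste().
collapsed <- aggregate(disease ~ gene + expression, data = df,
                       FUN = paste, collapse = ", ")

# aggregate() orders rows by the grouping columns, so re-sort by gene for display.
collapsed <- collapsed[order(collapsed$gene), ]
rownames(collapsed) <- NULL
print(collapsed)
```

Grouping on both gene and expression assumes that duplicated genes always carry the same expression value, as they do in the example; if they did not, each distinct gene/expression pair would get its own row.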

0 answers:

No answers