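I have been experimenting with a lot of R functions, such as melt(), split(), and reshape(), but none of them really solved my problem. Based on the data frame below, I want to create a new df that no longer has Long_group and B_Lymph_count, but instead has three separate columns, "Aubagio_0", "Aubagio_1", and "Aubagio_2", each holding the B_Lymph_count value for the corresponding MS.number (the patient ID).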
    MS.number  B_Lymph_count Long_group
13  "MS072/1"  " 57014"      "Aubagio_0"
14  "MS072/1"  "116730"      "Aubagio_1"
46  "MS1246/1" "117843"      "Aubagio_0"
47  "MS1246/1" "209583"      "Aubagio_1"
52  "MS1253/1" " 71434"      "Aubagio_0"
53  "MS1253/1" "130382"      "Aubagio_1"
100 "MS717/1"  " 63916"      "Aubagio_0"
101 "MS717/1"  " 62434"      "Aubagio_1"
102 "MS717/1"  " 43533"      "Aubagio_2"
The result I am after would look like this:
MS.number Aubagio_0 Aubagio_1 Aubagio_2
MS717/1       63916     62434     43533
MS1253/1      71434    130382        NA
...
I hope this is possible in R. Thank you very much for your replies!
Answer 0 (score: 1)
You can try:
library(reshape2)
dcast(data = d, MS.number ~ Long_group, value.var = "B_Lymph_count", fill=0)
         Aubagio_0 Aubagio_1 Aubagio_2
MS072/1      57014    116730         0
MS1246/1    117843    209583         0
MS1253/1     71434    130382         0
MS717/1      63916     62434     43533
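The fill argument specifies the value placed in empty cells, that is, MS.number / Long_group combinations that have no row in the data. You can also set it to NA, for example, which is what you get by default.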
Data
d <- structure(list(MS.number = structure(c(1L, 1L, 2L, 2L, 3L, 3L,
4L, 4L, 4L), .Label = c("MS072/1", "MS1246/1", "MS1253/1", "MS717/1"
), class = "factor"), B_Lymph_count = c(57014L, 116730L, 117843L,
209583L, 71434L, 130382L, 63916L, 62434L, 43533L), Long_group = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 3L), .Label = c("Aubagio_0", "Aubagio_1",
"Aubagio_2"), class = "factor")), .Names = c("MS.number", "B_Lymph_count",
"Long_group"), class = "data.frame", row.names = c("13", "14",
"46", "47", "52", "53", "100", "101", "102"))
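Since the question mentions reshape(), here is a minimal sketch of the same long-to-wide step using only base R, in case reshape2 is not available. It rebuilds the data with data.frame() so that it runs on its own (same values as d above, but with character columns instead of factors, which is my simplification), and the name wide is just an illustrative choice. Missing combinations come out as NA, as in the desired output in the question.

```r
# Same values as `d` above, rebuilt so the snippet is self-contained.
d <- data.frame(
  MS.number = c("MS072/1", "MS072/1", "MS1246/1", "MS1246/1",
                "MS1253/1", "MS1253/1", "MS717/1", "MS717/1", "MS717/1"),
  B_Lymph_count = c(57014L, 116730L, 117843L, 209583L,
                    71434L, 130382L, 63916L, 62434L, 43533L),
  Long_group = c("Aubagio_0", "Aubagio_1", "Aubagio_0", "Aubagio_1",
                 "Aubagio_0", "Aubagio_1", "Aubagio_0", "Aubagio_1",
                 "Aubagio_2"),
  stringsAsFactors = FALSE
)

# One row per MS.number, one column per Long_group level.
# Combinations with no row in `d` are filled with NA.
wide <- reshape(d, idvar = "MS.number", timevar = "Long_group",
                direction = "wide")

# reshape() names the new columns "B_Lymph_count.<group>"; drop the prefix.
names(wide) <- sub("^B_Lymph_count\\.", "", names(wide))

# The row names are inherited from the first row of each patient; reset them.
rownames(wide) <- NULL

print(wide)
```

Unlike the dcast() call above, this leaves the empty cells as NA rather than 0; if zeros are preferred, something like wide[is.na(wide)] <- 0 afterwards should do it.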