I have a dataset that looks like this:
ID Group1 Group2 Group3 Time Var1 Var2
1 A A1 A1.1 1 5 7
1 A A1 A1.1 2 5 7
1 A A1 A1.1 3 5 7
1 A A1 A1.1 4 5 7
1 A A1 A1.1 5 5 7
2 B B1 B1.1 1 5 7
2 B B1 B1.1 2 5 7
2 B B1 B1.1 3 5 7
2 B B1 B1.1 4 5 7
2 B B1 B1.1 5 5 7
3 C C1 C1.1 1 5 7
3 C C1 C1.1 3 5 7
3 C C1 C1.1 4 5 7
.
.
.
I would like it to look like this:
ID Group1 Group2 Group3 Time1 Time2 Time3...TimeN Time1 Time2 Time3...TimeN
1 A A1 A1.1 5 5 5 5 7 7 7 7
2 B B1 B1.1 5 5 5 5 7 7 7 7
3 C C1 C1.1 5 5 5 5 7 7 7 7
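Of course, not every value of Var1 is 5 and not every value of Var2 is 7; those values are only for illustration.
I want to reshape the data so that each row corresponds to exactly one ID, with Var1 and Var2 spread into columns (one per time point) holding that ID's values.
I tried: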
rave_subset <- reshape2::dcast(rave_subset, Group1 +
Group2 +
Group3 ~ Var1 + Var2, value.var = c("Var1", "Var2"))
But this strings the values of Var1 and Var2 together, so the columns show up as "Var1_Var2".
If I do this instead:
rave_subset <- reshape2::dcast(rave_subset, Group1 +
Group2 +
Group3 ~ Var1, value.var = "Var1")
then I get almost what I want, except that I lose Var2.
Answer 0 (score: 0)
I ended up doing several rounds of dcast and merge. It works, but it is not the most elegant solution.
The code is as follows:
rave_subset <- subset(RAVE, !is.na(RAVE$Days) & RAVE$Days <= 180, select = c("Participant ID",
"Days",
"Randomized Treatment Group",
"AAV Type",
"ANCA Status - PR3 or MPO",
"BVAS",
"Glucocorticoid Dose (mg)",
"Cum. Pred Dose Since Last Visit (mg)",
"baseline_BVAS"))
rave_subset1 <- reshape2::dcast(rave_subset, `Participant ID` +
`Randomized Treatment Group` +
`AAV Type` +
`ANCA Status - PR3 or MPO` +
baseline_BVAS ~ Days, value.var = "BVAS")
rave_subset2 <- reshape2::dcast(rave_subset, `Participant ID` ~ Days , value.var = "Glucocorticoid Dose (mg)")
rave_subset3 <- reshape2::dcast(rave_subset, `Participant ID` ~ Days , value.var = "Cum. Pred Dose Since Last Visit (mg)")
rave_subset <- merge(x=rave_subset1,
y=rave_subset2,
by="Participant ID")
drops <- c("Randomized Treatment Group.y", "AAV Type.y", "ANCA Status - PR3 or MPO.y", "baseline_BVAS.y")
rave_subset <- rave_subset[,!(names(rave_subset) %in% drops)]
rave_subset <- merge(x=rave_subset,
y=rave_subset3,
by="Participant ID")
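
As a possible alternative to the repeated dcast and merge steps, the sketch below uses data.table::dcast, which (unlike reshape2::dcast) accepts several columns in value.var and casts them in a single call. It is only a minimal sketch on toy data shaped like the question's example: the table name dt and the values in Var1 and Var2 are made up for illustration, and it assumes each ID has at most one row per Time.

```r
library(data.table)

# Toy data shaped like the question's example (values are invented).
# ID 3 has no row for Time 2, mirroring the gap in the original data.
dt <- data.table(
  ID     = c(1, 1, 1, 2, 2, 2, 3, 3),
  Group1 = c("A", "A", "A", "B", "B", "B", "C", "C"),
  Group2 = c("A1", "A1", "A1", "B1", "B1", "B1", "C1", "C1"),
  Group3 = c("A1.1", "A1.1", "A1.1", "B1.1", "B1.1", "B1.1", "C1.1", "C1.1"),
  Time   = c(1, 2, 3, 1, 2, 3, 1, 3),
  Var1   = c(5, 6, 7, 5, 5, 5, 4, 4),
  Var2   = c(7, 8, 9, 7, 7, 7, 1, 2)
)

# Left side of the formula: columns that identify one output row.
# Right side: the column whose values become the new column suffixes.
# Passing two names in value.var yields one block of columns per variable.
wide <- data.table::dcast(
  dt,
  ID + Group1 + Group2 + Group3 ~ Time,
  value.var = c("Var1", "Var2")
)

print(wide)
```

The resulting columns are named by variable and time point (Var1_1, Var1_2, ..., Var2_3), and time points that are missing for an ID are filled with NA. For the RAVE data the same idea would presumably be to convert rave_subset with data.table::setDT() and pass the three measurement columns ("BVAS", "Glucocorticoid Dose (mg)", "Cum. Pred Dose Since Last Visit (mg)") as value.var, with Days on the right side of the formula; that has not been run against the real dataset.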