Transposing two variables by group

Time: 2019-02-07 21:48:00

Tags: r transform

I have a dataset that looks like this:

ID Group1 Group2 Group3 Time Var1 Var2
1    A      A1     A1.1   1    5   7
1    A      A1     A1.1   2    5   7
1    A      A1     A1.1   3    5   7
1    A      A1     A1.1   4    5   7
1    A      A1     A1.1   5    5   7
2    B      B1     B1.1   1    5   7
2    B      B1     B1.1   2    5   7
2    B      B1     B1.1   3    5   7
2    B      B1     B1.1   4    5   7
2    B      B1     B1.1   5    5   7
3    C      C1     C1.1   1    5   7
3    C      C1     C1.1   3    5   7
3    C      C1     C1.1   4    5   7
.
.
.

I would like it to look like this:

ID Group1 Group2 Group3 Time1 Time2 Time3...TimeN Time1 Time2 Time3...TimeN
1    A      A1     A1.1   5     5      5      5      7     7     7      7
2    B      B1     B1.1   5     5      5      5      7     7     7      7
3    C      C1     C1.1   5     5      5      5      7     7     7      7

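Of course, not every value of Var1 is 5 and not every value of Var2 is 7; those values are only for illustration.

I want to reshape the data so that there is exactly one row per ID, with the Var1 and Var2 values spread across columns (one column per time point for each variable).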

I tried:

rave_subset <- reshape2::dcast(rave_subset, Group1 +
                          Group2 +
                          Group3 ~ Var1 + Var2, value.var = c("Var1", "Var2"))

But this concatenates the values of Var1 and Var2, so the columns come out named "Var1_Var2".

If I do this instead:

rave_subset <- reshape2::dcast(rave_subset, Group1 +
                          Group2 +
                          Group3 ~ Var1, value.var = "Var1")

then I get almost what I want, except that I lose Var2.

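1 answer:

Answer 0: (score: 0)

I ended up doing several rounds of dcast followed by merge. It works, but it is not the most elegant solution.

The code is as follows: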

rave_subset <- subset(RAVE, !is.na(RAVE$Days) & RAVE$Days <= 180, select = c("Participant ID",
                                                          "Days",
                                                          "Randomized Treatment Group",
                                                          "AAV Type",
                                                          "ANCA Status - PR3 or MPO",
                                                          "BVAS",
                                                          "Glucocorticoid Dose (mg)",
                                                          "Cum. Pred Dose Since Last Visit (mg)",
                                                          "baseline_BVAS"))

rave_subset1 <- reshape2::dcast(rave_subset, `Participant ID` +
                                  `Randomized Treatment Group` +
                                  `AAV Type` +
                                  `ANCA Status - PR3 or MPO` +
                                  baseline_BVAS ~ Days, value.var = "BVAS")

rave_subset2 <- reshape2::dcast(rave_subset, `Participant ID` ~ Days , value.var = "Glucocorticoid Dose (mg)")

rave_subset3 <- reshape2::dcast(rave_subset, `Participant ID` ~ Days , value.var = "Cum. Pred Dose Since Last Visit (mg)")

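# Merge the BVAS table with the glucocorticoid-dose table; day columns that
# appear in both tables receive merge()'s default ".x" / ".y" suffixes
rave_subset <- merge(x=rave_subset1,
                     y=rave_subset2,
                     by="Participant ID")

# Drop duplicated grouping columns if present (a no-op when rave_subset2
# contains only the ID and day columns, as built above)
drops <- c("Randomized Treatment Group.y", "AAV Type.y", "ANCA Status - PR3 or MPO.y", "baseline_BVAS.y")
rave_subset <- rave_subset[,!(names(rave_subset) %in% drops)]

# Add the cumulative-dose columns from the third wide table
rave_subset <- merge(x=rave_subset,
                    y=rave_subset3,
                    by="Participant ID")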
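
As a possible alternative to the repeated dcast and merge steps, here is a minimal sketch that widens both variables in one call with base R's `reshape()`. It is not the approach used above, only an illustration on toy data shaped like the question's example; the data frame `df` and its values are made up for this sketch.

```r
# Toy data modeled on the question's layout (values are invented for illustration).
# ID 3 deliberately has no row for Time 2, as in the question's example.
df <- data.frame(
  ID     = c(1, 1, 1, 2, 2, 2, 3, 3),
  Group1 = c("A", "A", "A", "B", "B", "B", "C", "C"),
  Group2 = c("A1", "A1", "A1", "B1", "B1", "B1", "C1", "C1"),
  Group3 = c("A1.1", "A1.1", "A1.1", "B1.1", "B1.1", "B1.1", "C1.1", "C1.1"),
  Time   = c(1, 2, 3, 1, 2, 3, 1, 3),
  Var1   = c(5, 6, 7, 5, 6, 7, 5, 7),
  Var2   = c(7, 8, 9, 7, 8, 9, 7, 9),
  stringsAsFactors = FALSE
)

# Spread Var1 and Var2 across one column per Time value, one row per ID/group combination
wide <- reshape(
  df,
  idvar     = c("ID", "Group1", "Group2", "Group3"),  # columns that identify a row
  timevar   = "Time",                                 # values become column suffixes
  v.names   = c("Var1", "Var2"),                      # variables to widen together
  direction = "wide"
)
rownames(wide) <- NULL  # discard the row names inherited from the long data

print(wide)
```

The widened columns are named by joining the variable name and the time value with a dot (for example `Var1.1` and `Var2.3`), and a time point missing for an ID shows up as `NA`. If adding a package is acceptable, `data.table::dcast()` accepts a vector for `value.var` and `tidyr::pivot_wider()` accepts several columns in `values_from`, so either should also handle both variables in one call; `reshape2::dcast()` takes only a single `value.var`.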