data.table: merging columns of pre/post data

Time: 2017-09-07 21:59:25

Tags: r merge data.table

My data.table looks like this:

       ID age gender relationship ACESscore PAPre PAPost NAPre NAPost PADelta NADelta
   3 6192  32      2            2         2     8     10    NA      3       2      NA
   4 6191  31      1            1         0     8     10     4      2       2      -2
   6 8421  25      1            2         0     9      9     3      5       0       2
   7 9991  18      1           NA        10     7      9     2      3       2       1
   8 9992  18      2           NA         5     8      8     4      2       0      -2
   9 7612  35      2            1         1     4      7     5      3       3      -2

I want to make line plots of PA Pre/Post and NA Pre/Post, and I think the best way (correct me if I'm wrong) is to get a new table like this:

 ID age gender relationship   ACESscore  PA         NA    PREPOST
   3 6192  32      2            2         2         10        1
   4 6191  31      1            1         0         10        1
   6 8421  25      1            2         0          9        1
   7 9991  18      1           NA        10          9        1
   8 9992  18      2           NA         5          8        1
   9 7612  35      2            1         1          7        1
   10 6192  32      2            2        8         NA        2
   11 6191  31      1            1        8         4         2
   12 8421  25      1            2        9          3        2
   13 9991  18      1           NA        7          2        2
   14 9992  18      2           NA        8          4        2
   15 7612  35      2            1        4          5        2

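How can I reshape the table so that each ID has two rows, with PAPre and PAPost stacked in one column and NAPre and NAPost stacked in another?

1 Answer:

Answer 0: (score: 2)

You can do this by reshaping with melt.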

melt(dat[, -c("PADelta", "NADelta")],
     measure.vars=list(c("PAPre", "PAPost"), c("NAPre", "NAPost")),
     value.name=c("PAVal", "NAVal"), variable.name="prepost")

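dat[, -c("PADelta", "NADelta")] drops the delta variables. The variables to be collapsed go into the measure.vars argument as a list, one character vector per output column. The last two arguments supply the names of the newly created variables: value.name for the stacked value columns and variable.name for the pre/post indicator.

This returns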

      ID age gender relationship ACESscore prepost PAVal NAVal
 1: 6192  32      2            2         2       1     8    NA
 2: 6191  31      1            1         0       1     8     4
 3: 8421  25      1            2         0       1     9     3
 4: 9991  18      1           NA        10       1     7     2
 5: 9992  18      2           NA         5       1     8     4
 6: 7612  35      2            1         1       1     4     5
 7: 6192  32      2            2         2       2    10     3
 8: 6191  31      1            1         0       2    10     2
 9: 8421  25      1            2         0       2     9     5
10: 9991  18      1           NA        10       2     9     3
11: 9992  18      2           NA         5       2     8     2
12: 7612  35      2            1         1       2     7     3
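
Note that prepost comes back as a factor whose levels are "1" and "2" (the position of each column within its measure.vars group), not "Pre" and "Post". For plotting, readable labels may be more convenient. A minimal sketch of relabeling it, using a cut-down two-row version of the data; the object names `small` and `long` and the labels "Pre"/"Post" are illustrative choices, not part of the original answer:

```r
library(data.table)

# Cut-down version of the example data (first two IDs only)
small <- data.table(
  ID     = c(6192L, 6191L),
  PAPre  = c(8L, 8L),  PAPost = c(10L, 10L),
  NAPre  = c(NA, 4L),  NAPost = c(3L, 2L)
)

long <- melt(small,
             measure.vars  = list(c("PAPre", "PAPost"), c("NAPre", "NAPost")),
             value.name    = c("PAVal", "NAVal"),
             variable.name = "prepost")

# Relabel the indicator: level "1" is the first column of each group (Pre),
# level "2" the second (Post)
long[, prepost := factor(prepost, levels = c("1", "2"),
                         labels = c("Pre", "Post"))]
```

The mapping from "1"/"2" to "Pre"/"Post" depends on the order in which the columns are listed in measure.vars, so it should be checked if that order changes.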

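Note: the initial post used dat[, .SD, .SDcols=-c("PADelta", "NADelta")] to subset the variables. In the comments, Frank alerted me that dat[, -c("PADelta", "NADelta")] does the same thing more concisely.

Frank also pointed out that data.table's patterns function can be used to select the variables to collapse by matching their names against regular expressions. This gives a more concise and more extensible approach (imagine more than two time periods).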

melt(dat[, -c("PADelta", "NADelta")],
     measure.vars=patterns("^PA", "^NA"),
     value.name=c("PAVal", "NAVal"), variable.name="prepost")
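
To illustrate the extensibility point, here is a hedged sketch with a third time period. The columns PAMid and NAMid, the object names `wide` and `long3`, and all values are invented for illustration; they do not appear in the question's data. Because "^PA" would also match a column such as PADelta, any such columns still need to be dropped first, as in the call above.

```r
library(data.table)

# Hypothetical data: PAMid / NAMid are invented columns for a third period
wide <- data.table(
  ID     = c(1L, 2L),
  PAPre  = c(8L, 7L), PAMid = c(9L, 8L), PAPost = c(10L, 9L),
  NAPre  = c(4L, 2L), NAMid = c(3L, 3L), NAPost = c(2L, 3L)
)

# The same call handles three periods without listing every column name
long3 <- melt(wide,
              measure.vars  = patterns("^PA", "^NA"),
              value.name    = c("PAVal", "NAVal"),
              variable.name = "period")
```

Each pattern must match the same number of columns, in the same period order, for the stacked PAVal and NAVal values to line up row by row.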

Data

dat <-
structure(list(ID = c(6192L, 6191L, 8421L, 9991L, 9992L, 7612L
), age = c(32L, 31L, 25L, 18L, 18L, 35L), gender = c(2L, 1L, 
1L, 1L, 2L, 2L), relationship = c(2L, 1L, 2L, NA, NA, 1L), ACESscore = c(2L, 
0L, 0L, 10L, 5L, 1L), PAPre = c(8L, 8L, 9L, 7L, 8L, 4L), PAPost = c(10L, 
10L, 9L, 9L, 8L, 7L), NAPre = c(NA, 4L, 3L, 2L, 4L, 5L), NAPost = c(3L, 
2L, 5L, 3L, 2L, 3L), PADelta = c(2L, 2L, 0L, 2L, 0L, 3L), NADelta = c(NA, 
-2L, 2L, 1L, -2L, -2L)), .Names = c("ID", "age", "gender", "relationship", 
"ACESscore", "PAPre", "PAPost", "NAPre", "NAPost", "PADelta", 
"NADelta"), row.names = c(NA, -6L), class = c("data.table", "data.frame"))