Reshaping pre/post data

Time: 2019-09-23 05:02:08

Tags: r reshape tidyr

I have a dataset:

df <- data.frame("ID" = c("sue_1","bob_2","nick_3","joe_4"),
                 "1_confidence.x" = c(3,3,1,5),
                 "2_reading.x" = c(4,3,2,5),
                 "3_maths.x" = c(3,2,4,2),
                 "1_confidence.y" = c(3,2,3,4),
                 "2_reading.y" = c(3,4,2,1),
                 "3_maths.y" = c(3,4,2,5)
)

which gives this df:

> df
      ID X1_confidence.x X2_reading.x X3_maths.x X1_confidence.y X2_reading.y X3_maths.y
1  sue_1               3            4          3               3            3          3
2  bob_2               3            3          2               2            4          4
3 nick_3               1            2          4               3            2          2
4  joe_4               5            5          2               4            1          5
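
A side note on the column names: the `X` prefix in the printed output appears because `data.frame()` runs the supplied names through `make.names()` by default, and names starting with a digit are not syntactically valid in R. The sketch below only illustrates that renaming; `df_raw` is a hypothetical name used for the example.

```r
# Names that start with a digit get an "X" prepended
make.names(c("1_confidence.x", "2_reading.y"))

# check.names = FALSE keeps the names exactly as typed
df_raw <- data.frame("1_confidence.x" = c(3, 3), check.names = FALSE)
names(df_raw)
```

Keeping the default (prefixed) names is usually more convenient, since the original names would need backticks everywhere.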

I would like it in this format:

      ID Test X1_confidence X2_reading X3_maths
1  sue_1  pre             3          4        3
2  sue_1 post             3          3        3
3  bob_2  pre             3          3        2
4  bob_2 post             2          4        4
5 nick_3  pre             1          2        4
6 nick_3 post             3          2        2
7  joe_4  pre             5          5        2
8  joe_4 post             4          1        5

I have tried reshape and gather, but can't seem to figure it out...

2 answers:

Answer 0: (score: 1)

This should do the trick:

df_long <- reshape(
  data = df,
  varying = list(c("X1_confidence.x","X1_confidence.y"),
                 c("X2_reading.x","X2_reading.y"),
                 c("X3_maths.x","X3_maths.y")),
  idvar = 'ID',
  v.names = c('X1_confidence', 'X2_reading', 'X3_maths'),
  timevar = 'Test',
  times = c('pre', 'post'),
  direction = 'long'
)

Then sort by ID:

df_long <- df_long[order(df_long$ID, decreasing = TRUE), ]
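
Sorting with `decreasing = TRUE` groups the rows by person, but in reverse alphabetical order, which is not quite the row order shown in the question. A minimal sketch, assuming you want the IDs in the order they appear in `df`, is to order by `match()` instead and drop the `sue_1.pre`-style row names that `reshape` creates:

```r
df <- data.frame("ID" = c("sue_1","bob_2","nick_3","joe_4"),
                 "1_confidence.x" = c(3,3,1,5),
                 "2_reading.x" = c(4,3,2,5),
                 "3_maths.x" = c(3,2,4,2),
                 "1_confidence.y" = c(3,2,3,4),
                 "2_reading.y" = c(3,4,2,1),
                 "3_maths.y" = c(3,4,2,5)
)

df_long <- reshape(
  data = df,
  varying = list(c("X1_confidence.x","X1_confidence.y"),
                 c("X2_reading.x","X2_reading.y"),
                 c("X3_maths.x","X3_maths.y")),
  idvar = 'ID',
  v.names = c('X1_confidence', 'X2_reading', 'X3_maths'),
  timevar = 'Test',
  times = c('pre', 'post'),
  direction = 'long'
)

# Order by each ID's position in the original df.
# reshape stacks all "pre" rows before the "post" rows, and ties keep
# that order, so "pre" stays ahead of "post" within each ID.
df_long <- df_long[order(match(df_long$ID, df$ID)), ]

# Replace the generated row names with plain 1..n
rownames(df_long) <- NULL
df_long
```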

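Answer 1: (score: 1)

There should be a more "direct" way using only pivot_longer, but I could not get the arguments right. Here is one way that uses both pivot_longer and pivot_wider from tidyr 1.0.0:
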
library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = starts_with("X"), names_to = "key") %>%
  mutate(key = sub("\\.x$|\\.y$", "", key)) %>%
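  group_by(ID, key) %>%
  # within each ID/key pair the .x row comes first, so it is labelled "pre"
  mutate(Test =  c("pre", "post")) %>%
  ungroup() %>%
  # named arguments, since newer tidyr releases no longer accept these positionally
  pivot_wider(id_cols = c(ID, Test), names_from = key, values_from = value)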

#  ID     Test  X1_confidence X2_reading X3_maths
#  <fct>  <chr>         <dbl>      <dbl>    <dbl>
#1 sue_1  pre               3          4        3
#2 sue_1  post              3          3        3
#3 bob_2  pre               3          3        2
#4 bob_2  post              2          4        4
#5 nick_3 pre               1          2        4
#6 nick_3 post              3          2        2
#7 joe_4  pre               5          5        2
#8 joe_4  post              4          1        5
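
The "more direct" route with pivot_longer alone probably looks something like the sketch below. It relies on the special `".value"` entry in `names_to`, which tells pivot_longer to use that part of each column name as the output column name. The regular expression in `names_pattern` and the result name `res` are my own choices for this example, and the mapping of `x` to "pre" and `y` to "post" is taken from the question's desired output.

```r
library(dplyr)
library(tidyr)

df <- data.frame("ID" = c("sue_1","bob_2","nick_3","joe_4"),
                 "1_confidence.x" = c(3,3,1,5),
                 "2_reading.x" = c(4,3,2,5),
                 "3_maths.x" = c(3,2,4,2),
                 "1_confidence.y" = c(3,2,3,4),
                 "2_reading.y" = c(3,4,2,1),
                 "3_maths.y" = c(3,4,2,5)
)

res <- df %>%
  pivot_longer(
    cols = -ID,
    # first capture group -> output column name, second -> the Test column
    names_to = c(".value", "Test"),
    names_pattern = "(.*)\\.(x|y)$"
  ) %>%
  # relabel the suffixes: x is the pre test, y is the post test
  mutate(Test = ifelse(Test == "x", "pre", "post"))

res
```

This avoids the group_by/mutate step entirely, because the pre/post label comes from the column suffix rather than from row position.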

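If your tidyr is not up to date, the same can be done with gather and spread:

df %>%
  gather(key, value, -ID) %>%
  mutate(key = sub("\\.x$|\\.y$", "", key)) %>%
  # group by ID as well, so each group holds exactly one .x and one .y row
  group_by(ID, key) %>%
  mutate(Test =  c("pre", "post")) %>%
  ungroup() %>%
  spread(key, value)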
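
The rows coming out of spread are not necessarily in the order shown in the question (as far as I can tell they come back sorted by ID and Test, which would put "post" before "pre"). A hedged sketch of one way to restore the intended order: turn `ID` and `Test` into factors with explicit levels and sort with `arrange()`. The name `res` and the choice of factor levels are assumptions made for this example.

```r
library(dplyr)
library(tidyr)

df <- data.frame("ID" = c("sue_1","bob_2","nick_3","joe_4"),
                 "1_confidence.x" = c(3,3,1,5),
                 "2_reading.x" = c(4,3,2,5),
                 "3_maths.x" = c(3,2,4,2),
                 "1_confidence.y" = c(3,2,3,4),
                 "2_reading.y" = c(3,4,2,1),
                 "3_maths.y" = c(3,4,2,5)
)

# IDs in the order they appear in the original data
id_levels <- unique(as.character(df$ID))

res <- df %>%
  gather(key, value, -ID) %>%
  mutate(key = sub("\\.x$|\\.y$", "", key)) %>%
  group_by(ID, key) %>%
  mutate(Test = c("pre", "post")) %>%
  ungroup() %>%
  # explicit factor levels define the sort order used by arrange()
  mutate(ID = factor(ID, levels = id_levels),
         Test = factor(Test, levels = c("pre", "post"))) %>%
  spread(key, value) %>%
  arrange(ID, Test)

res
```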