Transposing groups of rows into columns by time and customer ID

Time: 2017-10-31 19:21:04

Tags: r

I have a dataset like the following:

time    customerID  Material
20170101    1   a
20170101    1   b
20170102    1   d
20170102    1   e
20170102    1   f
20170101    2   s
20170102    2   a
20170102    2   c

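I would like to reshape it into the following, with one row per time and customerID and the materials spread across numbered columns: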

time    customerID  Material.1  Material.2  Material.3
20170101    1   a   b   
20170102    1   d   e   f
20170101    2   s       
20170102    2   a   c

To generate the sample table, run the following in R:

time <- c(20170101, 20170101, 20170102, 20170102, 20170102, 20170101, 20170102, 20170102)
customerID <- c(1,1,1,1,1,2,2,2)
Material <- c('a','b','d','e','f','s','a','c')
df <- data.frame(time, customerID, Material)

I tried reshape, but it did not work the way I expected. Any pointers would be greatly appreciated.
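For reference, base R's reshape() can usually produce this layout once every row carries a sequence number within its (time, customerID) group, since reshape() needs a column that says which wide column each value belongs to. The following is only a minimal sketch under that assumption; the helper column `seq` and the result name `wide` are names introduced here for illustration and are not part of the original data.

```r
# Sample data from the question (Material kept as character, not factor)
df <- data.frame(
  time = c(20170101, 20170101, 20170102, 20170102, 20170102, 20170101, 20170102, 20170102),
  customerID = c(1, 1, 1, 1, 1, 2, 2, 2),
  Material = c("a", "b", "d", "e", "f", "s", "a", "c"),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: position of each row within its
# (time, customerID) group, i.e. 1, 2, 3, ...
df$seq <- ave(seq_along(df$Material), df$time, df$customerID, FUN = seq_along)

# Long to wide: one row per (time, customerID), one Material.<seq> column
# per position; positions a group does not have are left as NA
wide <- reshape(df,
                idvar = c("time", "customerID"),
                timevar = "seq",
                direction = "wide")
```

Missing positions come back as NA rather than as empty strings, so they would need to be replaced afterwards if blanks are wanted.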

2 answers:

Answer 0: (score: 0)

Try this:

library(tidyr)
df %>% spread(Material, Material)

Output:

      time customerID    a    b    c    d    e    f    s
1 20170101          1    a    b <NA> <NA> <NA> <NA> <NA>
2 20170101          2 <NA> <NA> <NA> <NA> <NA> <NA>    s
3 20170102          1 <NA> <NA> <NA>    d    e    f <NA>
4 20170102          2    a <NA>    c <NA> <NA> <NA> <NA>

Answer 1: (score: 0)

Using dplyr and tidyr::spread:

library(dplyr)
library(tidyr)
df %>% 
  group_by(time, customerID) %>% 
  mutate(grp_id = paste0("Material.", row_number())) %>% 
  spread(grp_id, Material, fill = "") %>% 
  arrange(customerID)

#> # A tibble: 4 x 5
#> # Groups:   time, customerID [4]
#>       time customerID Material.1 Material.2 Material.3
#>      <int>      <int>      <chr>      <chr>      <chr>
#> 1 20170101          1          a          b           
#> 2 20170102          1          d          e          f
#> 3 20170101          2          s                      
#> 4 20170102          2          a          c
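
In more recent tidyr releases (1.0.0 and later), spread() is superseded by pivot_wider(). The same idea, numbering the rows within each group and then widening on that number, might be written as below. This is a sketch, not a tested drop-in replacement: it assumes tidyr 1.0.0 or later is installed, recreates the sample data with Material as character, and uses `wide` as an illustrative result name.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (Material kept as character, not factor)
df <- data.frame(
  time = c(20170101, 20170101, 20170102, 20170102, 20170102, 20170101, 20170102, 20170102),
  customerID = c(1, 1, 1, 1, 1, 2, 2, 2),
  Material = c("a", "b", "d", "e", "f", "s", "a", "c"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(time, customerID) %>%
  # Label each row by its position within the group: Material.1, Material.2, ...
  mutate(grp_id = paste0("Material.", row_number())) %>%
  ungroup() %>%
  # One column per label; positions a group does not have become ""
  pivot_wider(names_from = grp_id,
              values_from = Material,
              values_fill = list(Material = "")) %>%
  arrange(customerID, time)
```

Calling ungroup() before widening returns a plain tibble rather than a grouped one, which tends to be less surprising in later steps.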