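R: Partially reshape to a wide table while keeping a key column

Time: 2019-04-23 14:39:58

Tags: r dataframe data-structures reshape

I want to convert my current data frame into a wide table by spreading Lag across columns, while keeping the variable agent. Most cells of the wide table should contain the sales figures.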

library(reshape2)
set.seed(123)

day = rep(seq(as.Date('2019/01/01'), as.Date('2019/01/04'), by="day"), each = 5)
agent = sample(c('A', 'B', 'C'), 20, replace = T)
sales = rnorm(20, 100, 30) 
Lag = sample(0:3, 20, replace=T)

dt = data.frame(day, sales, agent, Lag)

Ideally, the result would look like this:

[image: desired wide table, with one row per day and agent and one column per Lag value]

I tried the following, but neither attempt gives the desired result.

dcast(dt, day~Lag, value.var='sales')
dcast(dt, day~Lag+agent, value.var='sales')

Any suggestions are appreciated!

2 Answers:

Answer 0: (score: 2)

Here is one option:

library(reshape2)
dcast(dt, day + agent ~ paste0("lag_", Lag), value.var='sales', fun.aggregate = sum)
#           day agent     lag_0     lag_1     lag_2     lag_3
# 1  2019-01-01     A   0.00000   0.00000 136.72245   0.00000
# 2  2019-01-01     B   0.00000 112.02314   0.00000   0.00000
# 3  2019-01-01     C 110.79441 103.32048   0.00000  83.32477
# 4  2019-01-02     A   0.00000 153.60739   0.00000   0.00000
# 5  2019-01-02     B   0.00000  85.81626   0.00000 235.97619
# 6  2019-01-02     C   0.00000   0.00000   0.00000  41.00149
# 7  2019-01-03     A   0.00000  81.24882   0.00000   0.00000
# 8  2019-01-03     B  78.13326   0.00000  93.46075   0.00000
# 9  2019-01-03     C   0.00000   0.00000  69.21987  67.96529
# 10 2019-01-04     A   0.00000 190.98950 104.60119   0.00000
# 11 2019-01-04     C 187.01365   0.00000   0.00000   0.00000

Note: the reshape2 package is no longer actively developed or maintained, so it is advisable to switch to an alternative such as data.table::dcast().
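As a rough sketch of the data.table::dcast() route mentioned in the note, the same reshape could look like the following. It rebuilds the question's data so that it runs on its own. The helper column lag_col and the result name wide are names introduced here for illustration and are not part of the original answer.

```r
library(data.table)

# Rebuild the example data from the question
set.seed(123)
day   <- rep(seq(as.Date("2019-01-01"), as.Date("2019-01-04"), by = "day"), each = 5)
agent <- sample(c("A", "B", "C"), 20, replace = TRUE)
sales <- rnorm(20, 100, 30)
Lag   <- sample(0:3, 20, replace = TRUE)
dt    <- data.table(day, sales, agent, Lag)

# lag_col is a hypothetical helper column holding the wide column names
dt[, lag_col := paste0("lag_", Lag)]

# One row per day/agent pair and one column per lag value;
# duplicate combinations are summed, missing ones are filled with 0
wide <- dcast(dt, day + agent ~ lag_col,
              value.var = "sales", fun.aggregate = sum, fill = 0)
wide
```

Depending on the R version, set.seed(123) may not reproduce the exact numbers shown above, because R 3.6.0 changed the default sampling method used by sample().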

Answer 1: (score: 1)

Here is a dplyr / tidyr alternative. Using spread from tidyr produces the desired layout:

library(dplyr)  # needed for mutate() in the next step
library(tidyr)
dt %>% spread(Lag, unique(Lag))

Then use dplyr to fill the columns accordingly:

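dt %>%
  spread(Lag, unique(Lag), fill = 0) %>%
  # each spread column holds its own Lag value (0 where absent), so
  # sales * value / value recovers sales; the `0` column is therefore
  # 0 everywhere, and sales with Lag 0 do not appear in it
  mutate(`0` = sales * `0`) %>%
  mutate(`1` = sales * `1`) %>%
  mutate(`2` = sales * `2` / 2) %>%
  mutate(`3` = sales * `3` / 3)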

          day     sales agent 0         1         2         3
1  2019-01-01  83.32477     C 0   0.00000   0.00000  83.32477
2  2019-01-01 103.32048     C 0 103.32048   0.00000   0.00000
3  2019-01-01 110.79441     C 0   0.00000   0.00000   0.00000
4  2019-01-01 112.02314     B 0 112.02314   0.00000   0.00000
5  2019-01-01 136.72245     A 0   0.00000 136.72245   0.00000
6  2019-01-02  41.00149     C 0   0.00000   0.00000  41.00149
7  2019-01-02  85.81626     B 0  85.81626   0.00000   0.00000
8  2019-01-02 114.93551     B 0   0.00000   0.00000 114.93551
9  2019-01-02 121.04068     B 0   0.00000   0.00000 121.04068
10 2019-01-02 153.60739     A 0 153.60739   0.00000   0.00000
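
In newer tidyr releases, pivot_wider() is the documented successor to spread(). The sketch below is not part of the original answer. It assumes tidyr 1.1.0 or later (so that values_fn accepts a single function and names_sort is available) and rebuilds the question's data so that it runs on its own. The result name wide and the lag_ prefix are choices made here for illustration.

```r
library(tidyr)

# Rebuild the example data from the question
set.seed(123)
day   <- rep(seq(as.Date("2019-01-01"), as.Date("2019-01-04"), by = "day"), each = 5)
agent <- sample(c("A", "B", "C"), 20, replace = TRUE)
sales <- rnorm(20, 100, 30)
Lag   <- sample(0:3, 20, replace = TRUE)
dt    <- data.frame(day, sales, agent, Lag)

# Keep day and agent as key columns, spread Lag into lag_* columns,
# sum sales when a day/agent/Lag combination occurs more than once,
# and fill missing combinations with 0
wide <- pivot_wider(
  dt,
  id_cols      = c(day, agent),
  names_from   = Lag,
  names_prefix = "lag_",
  names_sort   = TRUE,
  values_from  = sales,
  values_fn    = sum,
  values_fill  = 0
)
wide
```

Unlike the spread() approach above, this collapses the data to one row per day and agent, and sales with Lag 0 end up in their own lag_0 column.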