I have a data table with three variables: hours, mins, and x.
Here is a simple example:
stocks <- data.frame(
hours = c(0,0,0,0,0,0),
mins = c(10,10,10,20,20,30),
x = c(2,4,4,5,3,4)
)
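Output:
  hours mins x
1     0   10 2
2     0   10 4
3     0   10 4
4     0   20 5
5     0   20 3
6     0   30 4
Based on this table, I want to spread x into one column per hours/mins combination. The result should look like this: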
0_10 0_20 0_30
   2    5    4
   4    3
   4
I tried using dcast, but the resulting table only counts how often each value of x occurs :(
library(data.table)
dcast(setDT(stocks), x ~ hours+mins, value.var = c("x"))
#Aggregate function missing, defaulting to 'length'
   x 0_10 0_20 0_30
1: 2    1    0    0
2: 3    0    1    0
3: 4    2    0    1
4: 5    0    1    0
Any suggestions?
Thanks!
Answer 0 (score: 2)
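We need to change the dcast formula. With x on the left-hand side, dcast has to aggregate and falls back to counting; using a row index within each hours/mins group on the left-hand side keeps the values of x instead:
library(data.table)  # v1.9.7+ (needed for rowid())
# The row-index column in the result is named "hours", so it is dropped afterwards
dcast(setDT(stocks), rowid(hours, mins) ~ hours + mins, value.var = "x")[, hours := NULL][]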
#    0_10 0_20 0_30
# 1:    2    5    4
# 2:    4    3   NA
# 3:    4   NA   NA
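To see why this works, it may help to look at what rowid() returns on its own. This is a minimal sketch that rebuilds the stocks table from the question; the variable name idx is only used here for illustration.

```r
library(data.table)  # rowid() needs v1.9.7 or later

# Same data as in the question
stocks <- data.table(
  hours = c(0, 0, 0, 0, 0, 0),
  mins  = c(10, 10, 10, 20, 20, 30),
  x     = c(2, 4, 4, 5, 3, 4)
)

# rowid() numbers the rows within each (hours, mins) group,
# restarting at 1 whenever a new group begins
idx <- rowid(stocks$hours, stocks$mins)
print(idx)
# [1] 1 2 3 1 2 1
```

Because every combination of this index with an hours/mins pair occurs at most once, dcast has nothing to aggregate and places each x value in its own cell.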
For versions < 1.9.7, we create a sequence variable grouped by 'hours' and 'mins', and then run dcast:
setDT(stocks)[, Seq := 1:.N, by = .(hours, mins)]
dcast(stocks, Seq~hours + mins, value.var = "x")[, Seq := NULL][]
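The same approach can be run end to end as a self-contained sketch. It rebuilds the stocks table from the question, and the name wide is an assumption made only for this example. Printing the intermediate table shows the Seq column that takes the place of rowid().

```r
library(data.table)

# Same data as in the question
stocks <- data.table(
  hours = c(0, 0, 0, 0, 0, 0),
  mins  = c(10, 10, 10, 20, 20, 30),
  x     = c(2, 4, 4, 5, 3, 4)
)

# Number the rows within each (hours, mins) group
stocks[, Seq := seq_len(.N), by = .(hours, mins)]
print(stocks)  # Seq restarts at 1 for every hours/mins pair

# Seq identifies the output row, hours + mins identifies the output column
wide <- dcast(stocks, Seq ~ hours + mins, value.var = "x")
wide[, Seq := NULL]  # the helper column is no longer needed
print(wide)
```

Groups with fewer rows than the largest group are padded with NA, which matches the output shown for the rowid() version.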