This question builds on another question I asked earlier. Given the following MWE:
library(data.table)
test <- as.data.table(data.frame(event_id = c("A","B","A","A","B"),
                                 income = c(1,2,3,4,5),
                                 location = c("PlaceX","PlaceY","PlaceX","PlaceX","PlaceY")))
test
   event_id income location
1:        A      1   PlaceX
2:        B      2   PlaceY
3:        A      3   PlaceX
4:        A      4   PlaceX
5:        B      5   PlaceY
How would I obtain:
  event_id mean_inc loc_PlaceX loc_PlaceY
    (fctr)   (fctr)  (numeric)  (numeric)
1        A 2.666667          3          0
2        B 3.500000          0          2
What I have so far:
test %>%
  group_by(event_id, location) %>%
  summarise(mean_inc = mean(income))
Source: local data table [2 x 3]
Groups: event_id
  event_id location mean_inc
    (fctr)   (fctr)    (dbl)
1        A   PlaceX 2.666667
2        B   PlaceY 3.500000
Note that I have about 10 columns that I need to spread out the way I tried to with the location column above. There are also several million rows.
Answer 0 (score: 0)
Since the OP is showing a data.table, a data.table approach can be used:
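# add each event's mean income as a new column (modifies test by reference)
test[, mean_inc := mean(income), by = event_id]
# one column per location, counting the rows per event
dcast(test, event_id + mean_inc ~ location, value.var = "income", fun.aggregate = length)
#    event_id mean_inc PlaceX PlaceY
# 1:        A 2.666667      3      0
# 2:        B 3.500000      0      2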
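
For the roughly 10 columns mentioned in the question, one possible way to generalize is to melt all of them into long format first and then count the levels with a single dcast. The following is only a sketch: the second column `channel` and its values are made up for illustration, and `spread_cols` is a hypothetical name for the vector of columns to spread. It also assumes those columns all have the same type (character here).

```r
library(data.table)

# Same rows as the MWE, plus a hypothetical second column `channel`
test <- data.table(
  event_id = c("A", "B", "A", "A", "B"),
  income   = c(1, 2, 3, 4, 5),
  location = c("PlaceX", "PlaceY", "PlaceX", "PlaceX", "PlaceY"),
  channel  = c("web", "web", "shop", "web", "shop")
)

# Hypothetical: in practice this would list the ~10 columns to spread
spread_cols <- c("location", "channel")

# Long format: one row per (event_id, column name, level)
long <- melt(test, id.vars = "event_id", measure.vars = spread_cols)

# Count each level per event; new columns are named <column>_<level>
counts <- dcast(long, event_id ~ variable + value,
                value.var = "value", fun.aggregate = length)

# Mean income per event, computed once on the original table
means <- test[, .(mean_inc = mean(income)), by = event_id]

result <- merge(means, counts, by = "event_id")
result
```

Note that the melt step produces one row per original row per spread column, so with millions of rows and 10 columns the intermediate table may get large; whether that is acceptable would need to be checked on the real data.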