I am trying to run the following:
library(data.table)
customers <- data.table(
  id = c(1, 2, 3, 4),  # id of customers
  name = c("Frank", "Hans", "Peter", "Markus"),
  age = c(22, 34, 64, 19)
)
purchases <- data.table(
  id = c(1, 2, 3, 4, 5, 6),  # id of purchases
  customer_id = c(1, 2, 4, 2, 1, 5),
  name = c("CD", "hairdryer", "book", "glas", "paper", "chair"),
  product_type = c("home", "home", "home", "home", "office", "office")
)
purchases[customers, .N, by=product_type, on=c(customer_id = "id")]
purchases[customers, mean(age), by=product_type, on=c(customer_id = "id")]
I expected the first join to return
product_type N
1: home 4
2: NA 1
3: office 1
but instead I got
product_type N
1: home 4
2: NA 1
3: home 1
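The second command should perform the join, then group by product_type and compute the mean of age for each group. Instead, I get an error saying that age was not found.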
purchases[customers, mean(age), by=.EACHI, on=c(customer_id = "id")]
works perfectly, by the way.
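To make the intent concrete, here is a minimal sketch of the two-step version I would expect the commands above to be equivalent to: materialise the join first, then group the joined table in a chained call. The names joined, by_type, N and mean_age are only illustrative, and this reflects my understanding of how chaining should behave rather than a confirmed explanation of the results above.

```r
library(data.table)

customers <- data.table(
  id = c(1, 2, 3, 4),
  name = c("Frank", "Hans", "Peter", "Markus"),
  age = c(22, 34, 64, 19)
)
purchases <- data.table(
  id = c(1, 2, 3, 4, 5, 6),
  customer_id = c(1, 2, 4, 2, 1, 5),
  name = c("CD", "hairdryer", "book", "glas", "paper", "chair"),
  product_type = c("home", "home", "home", "home", "office", "office")
)

# Step 1: do the join on its own, so that the columns of customers
# (including age) become ordinary columns of the result.
joined <- purchases[customers, on = c(customer_id = "id")]

# Step 2: group the already-joined table by product_type.
by_type <- joined[, .(N = .N, mean_age = mean(age)), by = product_type]
print(by_type)
```

Because the grouping in step 2 runs on an ordinary table, age is visible to j and there is one output row per distinct product_type (including NA for the customer without purchases).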