Running a regression with group by

Time: 2017-07-01 08:23:48

Tags: r data.table

I want to build a separate regression model for each level of a categorical variable (Item_ID) by grouping on it.

I tried the following two calls, but I get the error shown below them:

a <- train[,predict(lm(Number_Of_Sales ~ Year + Month + Day + Weekday + day_of_year  + Category_3  + Category_2  + Category_1 + Weeknum), test[.BY]), by = Item_ID]

b <- test[,predict(lm(Number_Of_Sales ~ Year + Month + Day + Weekday + day_of_year  + Category_3  + Category_2  + Category_1 + Weeknum, data = train[.BY]), newdata=.SD),by = Item_ID]
Error in `[.data.frame`(test, , predict(lm(Number_Of_Sales ~ Year + Month +  : 
  unused argument (by = Item_ID)
Item_ID is present in both the test and train datasets. I also tried using train$Item_ID, but that did not work either. Can you help?

***** Updated the question to reproduce the error *****

train <- data.frame(state=rep(c('MA', 'NY'), c(10, 10)),
                year=rep(1:10, 2),
                response=c(rnorm(10), rnorm(10)))


test <- data.frame(state=rep(c('MA', 'NY'), c(5, 5)),
                    year=rep(1:5, 2),
                    response=c(rnorm(5), rnorm(5)))


a <- train[,predict(lm(response ~ Year), test[.BY]), by = state]

Error received:

Error in `[.data.frame`(train, , predict(lm(response ~ Year), test[.BY]),  : 
  unused argument (by = state)
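
A note on the message above: the `[.data.frame` in it indicates that `train` is a plain data.frame, and the data.frame subsetting method has no `by` argument. The `by =` syntax only works once the object is a data.table (for example after `setDT(train)` or `as.data.table(train)`). Separately, the formula refers to `Year` while the column is named `year`, and R is case-sensitive. The following is a minimal sketch of one possible approach, not a definitive answer. It assumes the data.table package is installed, builds the example data directly as data.tables, and introduces the names `models` and `preds` purely for illustration; the `set.seed` call is added only for reproducibility.

```r
library(data.table)

set.seed(1)  # assumption: added only so the sketch is reproducible

# Same shape as the example data above, but created as data.tables
train <- data.table(state = rep(c("MA", "NY"), c(10, 10)),
                    year = rep(1:10, 2),
                    response = rnorm(20))
test <- data.table(state = rep(c("MA", "NY"), c(5, 5)),
                   year = rep(1:5, 2),
                   response = rnorm(10))

# Fit one linear model per state; split() on a data.table with by =
# returns a list named by the group values ("MA", "NY")
models <- lapply(split(train, by = "state"),
                 function(d) lm(response ~ year, data = d))

# For each state in the test set, predict with that state's model.
# .BY$state is the current group value; .SD holds the group's other columns.
preds <- test[, .(year,
                  predicted = predict(models[[as.character(.BY$state)]],
                                      newdata = .SD)),
              by = state]

print(preds)
```

Fitting the models first and looking them up by group name keeps the grouped call short and avoids refitting if predictions are needed more than once. The same pattern should carry over to the original data with `Item_ID` as the grouping column, provided every `Item_ID` in `test` also occurs in `train`.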

0 answers:

No answers