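I'm stuck trying to work out sales trends. My problem is similar to the thread below, but I have an additional variable called 'item'.
How to determine trend of time-series of values in R
My desired end result looks like the sample below. Please help.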
Customer_ID Item Sales_Slope
Josh milk Positive
Josh eggs Negative
Eric milk Mixed
Eric eggs Positive
My data:
require("data.table")
dat <- data.table(
customer_ID=c(rep("Josh",6),rep("Ray",7),rep("Eric",7)),
item=c(rep("milk",3),rep("eggs",3),rep("milk",4),rep("eggs",3),rep("milk",3),rep("eggs",4)),
sales=c(35,50,65,65,52,49,15,10,13,9,35,50,65,65,52,49,15,10,13,9))
dat[,transaction_num:=seq(1,.N), by=c("customer_ID")]
Answer 0 (score: 1)
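I agree with @smci: the only thing that changes from that link is that there is one more "by" variable. I hope this solution makes that clear.
library(plyr)
# Label one customer/item group by the sign of its successive sales differences
abc <- function(x){
  if(all(diff(x$sales)>0)) return('Positive')
  if(all(diff(x$sales)<0)) return('Negative')
  return('Mixed')
}
# Apply abc() once per customer_ID/item combination
y <- ddply(dat, .(customer_ID, item), abc)
y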
customer_ID item V1
1 Eric eggs Mixed
2 Eric milk Negative
3 Josh eggs Negative
4 Josh milk Positive
5 Ray eggs Positive
6 Ray milk Mixed
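For readers who don't have plyr installed, the same per-group classification can be sketched in base R. This is only an illustration, not part of the answer above: it rebuilds the question's data as a plain data.frame, and `classify_trend` is a hypothetical helper name introduced here.

```r
# Rebuild the question's data as a plain data.frame (no packages needed)
dat <- data.frame(
  customer_ID = c(rep("Josh", 6), rep("Ray", 7), rep("Eric", 7)),
  item = c(rep("milk", 3), rep("eggs", 3), rep("milk", 4), rep("eggs", 3),
           rep("milk", 3), rep("eggs", 4)),
  sales = c(35, 50, 65, 65, 52, 49, 15, 10, 13, 9,
            35, 50, 65, 65, 52, 49, 15, 10, 13, 9)
)

# Hypothetical helper: same rule as abc(), but on a plain numeric vector
classify_trend <- function(sales) {
  d <- diff(sales)                      # successive differences within the group
  if (all(d > 0)) return("Positive")    # every step went up
  if (all(d < 0)) return("Negative")    # every step went down
  "Mixed"                               # anything else, including ties
}

# aggregate() calls the helper once per customer_ID/item combination
res <- aggregate(sales ~ customer_ID + item, data = dat, FUN = classify_trend)
names(res)[3] <- "Sales_Slope"
res
```

The rows come back in a different order than the ddply output, but the label for each customer/item pair should be the same.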
Answer 1 (score: 1)
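The data.table approach I'd outline is:
require(data.table)
# Label a vector of sales by the sign of its successive differences.
# The input is a single group, so a plain if/else is enough (no vectorised ifelse needed).
trend <- function(x) {
  if (all(diff(x)>0)) 'Positive'
  else if (all(diff(x)<0)) 'Negative'
  else 'Mixed'
}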
dat[, trend(sales), by=c("customer_ID","item")]
customer_ID item V1
1: Josh milk Positive
2: Josh eggs Negative
3: Ray milk Mixed
4: Ray eggs Positive
5: Eric milk Negative
6: Eric eggs Mixed
# or if you want to assign the result...
dat[, Sales_Slope:=trend(sales), by=c("customer_ID","item")]
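One caveat worth checking against your own data: `all()` on an empty vector is TRUE, so a customer/item group with a single transaction has no differences and would be labelled 'Positive' by the functions above, while a repeated value (difference of 0) falls into 'Mixed'. A minimal sketch of a more defensive variant follows; `trend_safe` and the choice to return NA for groups that are too short or contain missing values are my own assumptions, not something the answers specify.

```r
# Hypothetical defensive variant of trend(); base R only
trend_safe <- function(x) {
  d <- diff(x)
  # Fewer than two observations, or a missing value: no trend can be stated
  if (length(d) == 0 || anyNA(d)) return(NA_character_)
  if (all(d > 0)) return("Positive")
  if (all(d < 0)) return("Negative")
  "Mixed"
}

trend_safe(c(35, 50, 65))    # strictly increasing
trend_safe(c(15, 10, 13, 9)) # up and down
trend_safe(42)               # a single transaction
```

It returns `NA_character_` rather than a bare `NA` so that every group yields the same column type, which matters when it is used as `dat[, Sales_Slope:=trend_safe(sales), by=c("customer_ID","item")]`.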