Question

I'm stuck working out trends. My problem is similar to the thread below, but I have an additional variable called 'item'.

How to determine trend of time-series of values in R

I'd like my final result to look like the sample below. Please help.

Customer_ID  Item  Sales_Slope
Josh         milk  Positive
Josh         eggs  Negative
Eric         milk  Mixed
Eric         eggs  Positive

My data:

require("data.table")
dat <- data.table(
            customer_ID=c(rep("Josh",6),rep("Ray",7),rep("Eric",7)),
            item=c(rep("milk",3),rep("eggs",3),rep("milk",4),rep("eggs",3),rep("milk",3),rep("eggs",4)),
            sales=c(35,50,65,65,52,49,15,10,13,9,35,50,65,65,52,49,15,10,13,9))

dat[,transaction_num:=seq(1,.N), by=c("customer_ID")]

Answer 1

I agree with @smci: the only change from that link is that there are more "by" variables. I hope this solution makes it clear.

library(plyr)

# Classify one customer/item group by the sign of successive sales differences
abc <- function(x) {
  if (all(diff(x$sales) > 0)) return('Positive')
  if (all(diff(x$sales) < 0)) return('Negative')
  return('Mixed')
}

y <- ddply(dat, .(customer_ID, item), abc)
y
  customer_ID item       V1
1        Eric eggs    Mixed
2        Eric milk Negative
3        Josh eggs Negative
4        Josh milk Positive
5         Ray eggs Positive
6         Ray milk    Mixed

Answer 2

The data.table approach I outlined is:

require(data.table)

trend <- function(x) {
   ifelse(all(diff(x)>0), 'Positive',
   ifelse(all(diff(x)<0), 'Negative', 'Mixed'))
}

dat[, trend(sales), by=c("customer_ID","item")]
   customer_ID item       V1
1:        Josh milk Positive
2:        Josh eggs Negative
3:         Ray milk    Mixed
4:         Ray eggs Positive
5:        Eric milk Negative
6:        Eric eggs    Mixed

# or if you want to assign the result...
dat[, Sales_Slope:=trend(sales), by=c("customer_ID","item")]
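
One caveat with both answers: they assume the rows of each group are already in transaction order, and a group with a single row comes out as 'Positive', because diff() returns an empty vector and all(logical(0)) is TRUE in R. Below is a minimal sketch of one way to guard against that. The helper name trend_safe, the choice to return NA for such groups, and the small sample table are my own assumptions, not part of the answers above.

```r
library(data.table)

# Hypothetical helper (not from the answers above): classify a numeric
# vector's trend, returning NA when there are fewer than two observations.
trend_safe <- function(x) {
  d <- diff(x)
  if (length(d) == 0L) return(NA_character_)  # all(logical(0)) would be TRUE
  if (all(d > 0)) return("Positive")
  if (all(d < 0)) return("Negative")
  "Mixed"
}

# Small illustrative table; Ray/eggs has only one transaction
dat <- data.table(
  customer_ID = c("Josh", "Josh", "Josh", "Ray"),
  item        = c("milk", "milk", "milk", "eggs"),
  sales       = c(35, 50, 65, 40)
)
dat[, transaction_num := seq_len(.N), by = customer_ID]

# Make the transaction order explicit within each group before differencing
setorder(dat, customer_ID, item, transaction_num)
res <- dat[, .(Sales_Slope = trend_safe(sales)), by = .(customer_ID, item)]
res
```

Whether a one-row group should be NA, 'Mixed', or dropped altogether depends on what you plan to do with Sales_Slope afterwards, so treat the NA here as a placeholder choice.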

Determining sales trends using multiple variables (e.g. customer ID / item)
