我正在使用Facebook发布的名为Prophet的新软件包。它做时间序列预测,我想通过组应用此功能。
向下滚动到R部分。
https://facebookincubator.github.io/prophet/docs/quick_start.html
这是我的尝试:
grouped_output = df %>% group_by(group) %>%
do(m = prophet(df[,c(1,3)])) %>%
do(future = make_future_dataframe(m, period = 7)) %>%
do(forecast = prophet:::predict.prophet(m, future))
grouped_output[[1]]
然后我需要从每组的列表中提取我遇到的麻烦。
以下是没有群组的原始数据框:
ds <- as.Date(c('2016-11-01','2016-11-02','2016-11-03','2016-11-04',
'2016-11-05','2016-11-06','2016-11-07','2016-11-08',
'2016-11-09','2016-11-10','2016-11-11','2016-11-12',
'2016-11-13','2016-11-14','2016-11-15','2016-11-16',
'2016-11-17','2016-11-18','2016-11-19','2016-11-20',
'2016-11-21','2016-11-22','2016-11-23','2016-11-24',
'2016-11-25','2016-11-26','2016-11-27','2016-11-28',
'2016-11-29','2016-11-30'))
y <- c(15,17,18,19,20,54,67,23,12,34,12,78,34,12,3,45,67,89,12,111,123,112,14,566,345,123,567,56,87,90)
y<-as.numeric(y)
df <- data.frame(ds, y)
df
ds y
1 2016-11-01 15
2 2016-11-02 17
3 2016-11-03 18
4 2016-11-04 19
5 2016-11-05 20
6 2016-11-06 54
7 2016-11-07 67
8 2016-11-08 23
9 2016-11-09 12
10 2016-11-10 34
11 2016-11-11 12
12 2016-11-12 78
13 2016-11-13 34
14 2016-11-14 12
15 2016-11-15 3
16 2016-11-16 45
17 2016-11-17 67
18 2016-11-18 89
19 2016-11-19 12
20 2016-11-20 111
21 2016-11-21 123
22 2016-11-22 112
23 2016-11-23 14
24 2016-11-24 566
25 2016-11-25 345
26 2016-11-26 123
27 2016-11-27 567
28 2016-11-28 56
29 2016-11-29 87
30 2016-11-30 90
当我按照以下方式对单个组进行操作时,当前函数有效:
#install.packages('prophet')
library(prophet)
m<-prophet(df)
future <- make_future_dataframe(m, period = 7)
forecast <- prophet:::predict.prophet(m, future)
forecast$yhat
[1] -2.649032 -29.762095 128.169781 59.573684 -11.623727 107.473617 -29.949730 -42.862455 -62.378408 104.797639 46.868610
[12] -12.502864 119.282058 -4.914921 -4.402638 -10.643570 169.309505 123.321261 74.734746 215.856347 99.290218 105.508059
[23] 102.882915 284.245984 237.401258 185.688202 321.466962 197.451536 194.280518 180.535663 349.304365 288.684031 222.337210
[34] 342.968499 203.648851 185.377165
我现在想要更改它,以便将prophet:::predict
函数应用于每个组。所以新数据框BY GROUP看起来像这样:
ds <- as.Date(c('2016-11-01','2016-11-02','2016-11-03','2016-11-04',
'2016-11-05','2016-11-06','2016-11-07','2016-11-08',
'2016-11-09','2016-11-10','2016-11-11','2016-11-12',
'2016-11-13','2016-11-14','2016-11-15','2016-11-16',
'2016-11-17','2016-11-18','2016-11-19','2016-11-20',
'2016-11-21','2016-11-22','2016-11-23','2016-11-24',
'2016-11-25','2016-11-26','2016-11-27','2016-11-28',
'2016-11-29','2016-11-30',
'2016-11-01','2016-11-02','2016-11-03','2016-11-04',
'2016-11-05','2016-11-06','2016-11-07','2016-11-08',
'2016-11-09','2016-11-10','2016-11-11','2016-11-12',
'2016-11-13','2016-11-14','2016-11-15','2016-11-16',
'2016-11-17','2016-11-18','2016-11-19','2016-11-20',
'2016-11-21','2016-11-22','2016-11-23','2016-11-24',
'2016-11-25','2016-11-26','2016-11-27','2016-11-28',
'2016-11-29','2016-11-30'))
y <- c(15,17,18,19,20,54,67,23,12,34,12,78,34,12,3,45,67,89,12,111,123,112,14,566,345,123,567,56,87,90,
45,23,12,10,21,34,12,45,12,44,87,45,32,67,1,57,87,99,33,234,456,123,89,333,411,232,455,55,90,21)
y<-as.numeric(y)
group<-c("A","A","A","A","A","A","A","A","A","A","A","A","A","A","A",
"A","A","A","A","A","A","A","A","A","A","A","A","A","A","A",
"B","B","B","B","B","B","B","B","B","B","B","B","B","B","B",
"B","B","B","B","B","B","B","B","B","B","B","B","B","B","B")
df <- data.frame(ds,group, y)
df
ds group y
1 2016-11-01 A 15
2 2016-11-02 A 17
3 2016-11-03 A 18
4 2016-11-04 A 19
5 2016-11-05 A 20
6 2016-11-06 A 54
7 2016-11-07 A 67
8 2016-11-08 A 23
9 2016-11-09 A 12
10 2016-11-10 A 34
11 2016-11-11 A 12
12 2016-11-12 A 78
13 2016-11-13 A 34
14 2016-11-14 A 12
15 2016-11-15 A 3
16 2016-11-16 A 45
17 2016-11-17 A 67
18 2016-11-18 A 89
19 2016-11-19 A 12
20 2016-11-20 A 111
21 2016-11-21 A 123
22 2016-11-22 A 112
23 2016-11-23 A 14
24 2016-11-24 A 566
25 2016-11-25 A 345
26 2016-11-26 A 123
27 2016-11-27 A 567
28 2016-11-28 A 56
29 2016-11-29 A 87
30 2016-11-30 A 90
31 2016-11-01 B 45
32 2016-11-02 B 23
33 2016-11-03 B 12
34 2016-11-04 B 10
35 2016-11-05 B 21
36 2016-11-06 B 34
37 2016-11-07 B 12
38 2016-11-08 B 45
39 2016-11-09 B 12
40 2016-11-10 B 44
41 2016-11-11 B 87
42 2016-11-12 B 45
43 2016-11-13 B 32
44 2016-11-14 B 67
45 2016-11-15 B 1
46 2016-11-16 B 57
47 2016-11-17 B 87
48 2016-11-18 B 99
49 2016-11-19 B 33
50 2016-11-20 B 234
51 2016-11-21 B 456
52 2016-11-22 B 123
53 2016-11-23 B 89
54 2016-11-24 B 333
55 2016-11-25 B 411
56 2016-11-26 B 232
57 2016-11-27 B 455
58 2016-11-28 B 55
59 2016-11-29 B 90
60 2016-11-30 B 21
我如何预测使用prophet
套餐,按群组而不是总计?
答案 0 :(得分:12)
以下是使用tidyr::nest
按组嵌套数据的解决方案,使用purrr::map
将模型拟合到这些组中,然后根据请求检索y-hat。
我接受了您的代码,但将其合并到使用mutate
计算新列的purrr::map
次调用中。
library(prophet)
library(dplyr)
library(purrr)
library(tidyr)
d1 <- df %>%
nest(-group) %>%
mutate(m = map(data, prophet)) %>%
mutate(future = map(m, make_future_dataframe, period = 7)) %>%
mutate(forecast = map2(m, future, predict))
以下是此时的输出:
d1
# A tibble: 2 × 5
group data m future
<fctr> <list> <list> <list>
1 A <tibble [30 × 2]> <S3: list> <data.frame [36 × 1]>
2 B <tibble [30 × 2]> <S3: list> <data.frame [36 × 1]>
# ... with 1 more variables: forecast <list>
然后我使用unnest()
从forecast
列中检索数据,并根据要求选择y-hat值。
d <- d1 %>%
unnest(forecast) %>%
select(ds, group, yhat)
以下是新预测值的输出:
d %>% group_by(group) %>%
top_n(7, ds)
Source: local data frame [14 x 3]
Groups: group [2]
ds group yhat
<date> <fctr> <dbl>
1 2016-11-30 A 180.53422
2 2016-12-01 A 349.30277
3 2016-12-02 A 288.68215
4 2016-12-03 A 222.33501
5 2016-12-04 A 342.96654
6 2016-12-05 A 203.64625
7 2016-12-06 A 185.37395
8 2016-11-30 B 131.07827
9 2016-12-01 B 222.83703
10 2016-12-02 B 236.33555
11 2016-12-03 B 145.41001
12 2016-12-04 B 228.59687
13 2016-12-05 B 162.49244
14 2016-12-06 B 68.44477
答案 1 :(得分:4)
我正在为同样的问题寻找解决方案。我提出了以下代码,这比接受的答案简单一点。
library(tidyr)
library(dplyr)
library(prophet)
data = df %>%
group_by(group) %>%
do(predict(prophet(.), make_future_dataframe(prophet(.), periods = 7))) %>%
select(ds, group, yhat)
以下是预测值
data %>% group_by(group) %>%
top_n(7, ds)
# A tibble: 14 x 3
# Groups: group [2]
ds group yhat
<date> <fctr> <dbl>
1 2016-12-01 A 316.9709
2 2016-12-02 A 258.2153
3 2016-12-03 A 196.6835
4 2016-12-04 A 346.2338
5 2016-12-05 A 208.9083
6 2016-12-06 A 216.5847
7 2016-12-07 A 206.3642
8 2016-12-01 B 230.0424
9 2016-12-02 B 268.5359
10 2016-12-03 B 190.2903
11 2016-12-04 B 312.9019
12 2016-12-05 B 266.5584
13 2016-12-06 B 189.3556
14 2016-12-07 B 168.9791