I have data in this format:
lid=structure(list(data_user_create.order = structure(c(1L, 1L, 1L,
2L, 2L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L,
6L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L, 12L,
12L, 12L, 12L, 13L, 13L, 14L, 14L, 14L, 14L, 15L, 15L, 15L, 15L,
15L, 16L, 16L, 16L, 16L, 16L, 17L, 17L, 17L, 17L, 18L, 18L, 18L,
19L, 19L, 19L, 20L, 20L, 21L, 21L, 21L, 21L, 21L, 21L, 22L, 22L,
22L, 22L, 23L, 23L, 23L, 23L, 24L, 24L, 24L, 25L, 25L, 26L, 26L,
26L, 26L, 26L, 26L, 27L, 27L, 27L, 28L, 28L, 28L, 29L, 29L, 30L,
31L, 31L, 31L, 32L, 32L, 32L, 32L, 33L, 33L, 33L, 33L, 33L, 33L,
33L, 34L, 34L, 34L, 35L, 35L, 35L, 35L), .Label = c("24.08.2017 10:26",
"24.08.2017 10:27", "24.08.2017 10:28", "24.08.2017 10:29", "24.08.2017 10:30",
"24.08.2017 10:31", "24.08.2017 10:32", "24.08.2017 10:34", "24.08.2017 10:37",
"24.08.2017 10:38", "24.08.2017 10:39", "24.08.2017 10:40", "24.08.2017 10:42",
"24.08.2017 10:43", "24.08.2017 10:44", "24.08.2017 10:45", "24.08.2017 10:46",
"24.08.2017 10:47", "24.08.2017 10:48", "24.08.2017 10:49", "24.08.2017 10:50",
"24.08.2017 10:51", "24.08.2017 10:52", "24.08.2017 10:53", "24.08.2017 10:54",
"24.08.2017 10:55", "24.08.2017 10:56", "24.08.2017 10:57", "24.08.2017 10:58",
"24.08.2017 10:59", "24.08.2017 11:00", "24.08.2017 11:01", "24.08.2017 11:02",
"24.08.2017 11:03", "24.08.2017 11:04"), class = "factor"), goods = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Ecoslim", "Hammer of Thor"
), class = "factor"), id_users = 1:119), .Names = c("data_user_create.order",
"goods", "id_users"), class = "data.frame", row.names = c(NA,
-119L))
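data_user_create.order is the time at which a person created an order in the shop.
goods is the type of product.
id_users is the user ID; this column is not important.
Different people can order the same product at the same time.
So, for each product, I need to forecast the dates of future orders, in dd-mm-yyyy-hh-mm-ss format.
My attempt: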
library("forecast")
my_forecast <- function(x){
model <- arima(x, order = c(1, 1, 1))
fcast <- forecast(model, 2)
return(fcast)
}
#applying it as follows
lapply(lid[1], my_forecast)
and the error:
Error in arima(x, order = c(1, 1, 1)) : non-stationary AR part from CSS
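I think arima cannot forecast events as such, because what I am trying to predict is not a measured variable but the next date on which an event occurs (a person creating an order).
This is the desired output: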
date person created order    type of goods
24.08.2017 11:04:02          Ecoslim
24.08.2017 11:04:38          Ecoslim
24.08.2017 11:05:45          Ecoslim
24.08.2017 11:04:02          pillow
24.08.2017 11:04:38          pillow
24.08.2017 11:05:45          pillow
How can I forecast the event dates?
Answer 0 (score: 0)
Generally, for this kind of problem, I would use a regression model.
https://www.machinelearningplus.com/machine-learning/complete-introduction-linear-regression-r/
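As a rough illustration of the regression idea (not a recommended final model), one could regress the order time on the order's sequence number within each product and extrapolate to the next few sequence numbers. The sketch below uses a small toy data frame with the same column names as `lid`; the values, the helper name `predict_next_orders`, and the choice of a straight-line fit are all assumptions made for illustration.

```r
# Toy data with the same column names as `lid` (values are illustrative only)
orders <- data.frame(
  data_user_create.order = c("24.08.2017 10:26", "24.08.2017 10:26",
                             "24.08.2017 10:27", "24.08.2017 10:29",
                             "24.08.2017 10:30", "24.08.2017 10:30",
                             "24.08.2017 10:32", "24.08.2017 10:33"),
  goods = c("Ecoslim", "Hammer of Thor", "Ecoslim", "Hammer of Thor",
            "Ecoslim", "Ecoslim", "Hammer of Thor", "Ecoslim"),
  stringsAsFactors = TRUE
)

# The column is a factor, so convert it to text first and then to a date-time.
# The format is day.month.year hour:minute; UTC is an assumed time zone.
orders$created <- as.POSIXct(as.character(orders$data_user_create.order),
                             format = "%d.%m.%Y %H:%M", tz = "UTC")

# Hypothetical helper: fit time ~ order index and extrapolate n_ahead orders
predict_next_orders <- function(created, n_ahead = 3) {
  created <- sort(created)
  d <- data.frame(k = seq_along(created), secs = as.numeric(created))
  fit <- lm(secs ~ k, data = d)
  new_k <- data.frame(k = length(created) + seq_len(n_ahead))
  # Round to whole seconds before converting back to a date-time
  as.POSIXct(round(predict(fit, newdata = new_k)),
             origin = "1970-01-01", tz = "UTC")
}

# One forecast per product
next_orders <- lapply(split(orders, orders$goods),
                      function(d) predict_next_orders(d$created))

# Print in the dd.mm.yyyy hh:mm:ss style used in the question
lapply(next_orders, format, "%d.%m.%Y %H:%M:%S")
```

A straight line assumes orders arrive at a roughly constant pace, which may not hold for real shop data.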
Alternatively, consider time series forecasting.
https://www.pluralsight.com/guides/time-series-forecasting-using-r
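For a time series model to apply, the event log first has to become a regularly spaced series, for example the number of orders per minute for each product. The sketch below does that with base R on toy data shaped like `lid` and then fits a deliberately trivial `arima` with order (0, 0, 0), i.e. a constant mean. The toy values, the helper names, and the model order are assumptions; a real analysis would choose the order from the data.

```r
# Toy data with the same column names as `lid` (values are illustrative only)
orders <- data.frame(
  data_user_create.order = c("24.08.2017 10:26", "24.08.2017 10:26",
                             "24.08.2017 10:27", "24.08.2017 10:29",
                             "24.08.2017 10:30", "24.08.2017 10:30",
                             "24.08.2017 10:32", "24.08.2017 10:33"),
  goods = c("Ecoslim", "Hammer of Thor", "Ecoslim", "Hammer of Thor",
            "Ecoslim", "Ecoslim", "Hammer of Thor", "Ecoslim"),
  stringsAsFactors = TRUE
)
orders$created <- as.POSIXct(as.character(orders$data_user_create.order),
                             format = "%d.%m.%Y %H:%M", tz = "UTC")

# Regular one-minute grid over the observed range, so that minutes
# without any order are counted as 0 instead of being skipped
grid <- seq(min(orders$created), max(orders$created), by = "1 min")
grid_labels <- format(grid, "%Y-%m-%d %H:%M")

# Hypothetical helper: orders per minute on the common grid
count_per_minute <- function(times) {
  labels <- format(times, "%Y-%m-%d %H:%M")
  as.vector(table(factor(labels, levels = grid_labels)))
}

counts <- lapply(split(orders, orders$goods),
                 function(d) count_per_minute(d$created))

# Hypothetical helper: forecast the count for the next h minutes.
# order = c(0, 0, 0) is a placeholder that only estimates a mean level.
forecast_counts <- function(x, h = 3) {
  fit <- arima(ts(x), order = c(0, 0, 0))
  as.numeric(predict(fit, n.ahead = h)$pred)
}

fc <- lapply(counts, forecast_counts)
fc
```

Note that this forecasts how many orders to expect per minute, not the exact timestamps; an expected gap between orders can be read off as roughly one over the forecast rate.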
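But... your dataset looks very sparse. You have an ID (which is practically useless in the ML world), plus a date and a product. What is the logic? Before a machine can be trained to make predictions (supervised learning), a human must first know how the prediction should be made.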