我有一个看起来像这样的数据框
Customer Mon1 Mon2 Mon3 Mon4 Mon5 Mon6 Mon7 Mon8 Mon9 Mon10 Mon11 Mon12
1 BS:100021 83.6140 76.7849 71.8369 66.8452 66.9263 53.5129 48.0321 44.5750 34.4080 40.8653 40.6369 24.8658
2 BS:100022 -0.5097 0.9198 -1.6027 -0.7160 -40.7443 -40.8863 -40.7049 -40.4623 -48.8805 -44.8879 -48.3559 -39.8656
3 BS:100025 106.4243 102.3998 100.9183 99.1006 98.1092 95.8125 94.2770 95.9911 92.1445 94.0984 87.7465 86.1946
4 BS:100037 37.5871 37.5888 37.5905 37.5924 37.5941 37.5957 37.5977 37.5993 37.5797 50.8395 50.8416 37.6064
5 BS:100050 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
6 BS:100056 780.4214 88.0918 88.9721 88.8143 90.9508 97.9842 96.6309 101.7312 84.7743 76.5239 133.6655 86.3668
7 BS:100063 15.1694 15.1993 17.6528 19.6854 23.9929 27.5048 18.1503 19.8184 17.3152 17.3084 18.4588 24.0067
8 BS:100079 0.0292 0.0827 0.3120 0.1206 1.6245 2.3239 2.5857 0.1718 0.4340 0.6849 3.2916 2.2456
9 BS:100089 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
10 BS:100091 0.1324 2.8137 0.9854 0.2405 0.2811 0.1312 0.0174 2.4304 0.7994 0.5884 0.2618 0.1233
现在,我想为未来12个月的每位客户创建另一个数据框 我不知道如何使用接下来的12个月数据创建另一个数据框。例如,我尝试过
make_new_data <- function(x){
require(forecast)
ts_object <- as.numeric(x[-1])
forecasted_data <- data.frame(naive(ts_object,12))[,1]
new_data <- c(x[1],as.character(forecasted_data))
return(new_data)
}
z <- apply(test_data_entry,1,make_new_data)
这显然无效c(x[1],as.character(forecasted_data))
没有制作像。的载体。
1 BS:100021 83.614 76.7849 71.8369 66.8452 66.9263 53.5129 48.0321 44.575 34.408 40.8653 40.6369 24.8658
我如何让它工作,其次是有办法以更快的方式做到这一点
答案 0 :(得分:2)
你有宽格式的数据,快速(我也会说更好)的方式是使用长格式。我的方法使用data.table。
首先,您需要使用melt
- 包的reshape2
- 函数将数据从宽变换为长。鉴于您的数据是什么样的,它应该是这样的(假设您的数据被称为test_data_entry
):
library(data.table)
library(reshape2)
dt.wide <- as.data.table(test_data_entry)
dt.long <- melt(dt.wide, id.vars = "Customer")
# doing some minor changes
dt <- dt.long[, .(Customer,
Month = as.numeric(gsub("Mon", "", variable)),
value)] # replace the MonX with X etc.
dt[, forecasted.value := as.data.frame(naive(value, 12))[,1],
by = Customer]
dt[order(Customer)]
# Customer Month value forecasted.value
# 1: BS:100021 1 83.6140 24.8658
# 2: BS:100021 2 76.7849 24.8658
# 3: BS:100021 3 71.8369 24.8658
# 4: BS:100021 4 66.8452 24.8658
# 5: BS:100021 5 66.9263 24.8658
# ---
# 116: BS:100091 8 2.4304 0.1233
# 117: BS:100091 9 0.7994 0.1233
# 118: BS:100091 10 0.5884 0.1233
# 119: BS:100091 11 0.2618 0.1233
# 120: BS:100091 12 0.1233 0.1233
你是对的,as.character
混乱。这应该解决它:
library(forecast)
make_new_data <- function(x){
ts_object <- as.numeric(x[-1])
forecasted_data <- data.frame(naive(ts_object, 12))[, 1]
new_data <- data.frame(as.character(x[1]), t(forecasted_data))
names(new_data) <- c("Customer", paste0("Mon", 1:12))
return(new_data)
}
z <- data.table::rbindlist(apply(test_data_entry,1,make_new_data))
z
# Customer Mon1 Mon2 Mon3 Mon4 Mon5 Mon6 Mon7 Mon8 Mon9
# 1: BS:100021 24.8658 24.8658 24.8658 24.8658 24.8658 24.8658 24.8658 24.8658 24.8658
# 2: BS:100022 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656 -39.8656
# 3: BS:100025 86.1946 86.1946 86.1946 86.1946 86.1946 86.1946 86.1946 86.1946 86.1946
# 4: BS:100037 37.6064 37.6064 37.6064 37.6064 37.6064 37.6064 37.6064 37.6064 37.6064
# 5: BS:100050 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
# 6: BS:100056 86.3668 86.3668 86.3668 86.3668 86.3668 86.3668 86.3668 86.3668 86.3668
# 7: BS:100063 24.0067 24.0067 24.0067 24.0067 24.0067 24.0067 24.0067 24.0067 24.0067
# 8: BS:100079 2.2456 2.2456 2.2456 2.2456 2.2456 2.2456 2.2456 2.2456 2.2456
# 9: BS:100089 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
# 10: BS:100091 0.1233 0.1233 0.1233 0.1233 0.1233 0.1233 0.1233 0.1233 0.1233
# Mon10 Mon11 Mon12
# 1: 24.8658 24.8658 24.8658
# 2: -39.8656 -39.8656 -39.8656
# 3: 86.1946 86.1946 86.1946
# 4: 37.6064 37.6064 37.6064
# 5: 0.0000 0.0000 0.0000
# 6: 86.3668 86.3668 86.3668
# 7: 24.0067 24.0067 24.0067
# 8: 2.2456 2.2456 2.2456
# 9: 0.0000 0.0000 0.0000
# 10: 0.1233 0.1233 0.1233