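In short, I have a larger function that creates data.frames that are subsets of a bigger data.frame and are named after the function's arguments. It builds data frames of the raw data as well as the Holt-Winters output and the forecast output, which means it creates multiple data.frames. Here is a small example (although there aren't enough intervals here to actually generate a ts-class data.frame):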
Group <- c("Primary_Group","Primary_Group","Primary_Group","Primary_Group","Primary_Group","Primary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Secondary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group","Tertiary_Group")
Day <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
Type <- c("A","A","A","B","B","B","A","A","A","B","B","B","A","A","A","B","B","B")
Value <- c(7,3,10,3,9,4,0,9,3,10,1,6,3,4,10,2,3,1)
df <- as.data.frame(cbind(Group,Day,Type,Value))
Fun <- function(Group, Type, A, B, G) {
  # Data is assumed to be the full data frame (the sample above builds it as df)
  df <- Data[Data$Group == Group & Data$Type == Type, ]
  assign(paste(Group, Type, "_df", sep = ''), df, envir = parent.frame())
  df_holtwinters <- HoltWinters(ts(Data[Data$Group == Group & Data$Type == Type, ],
                                   frequency = 365), alpha = A, beta = B, gamma = G)
  assign(paste(Group, Type, "_hw", sep = ''), df_holtwinters, envir = parent.frame())
}
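Note that Group and Type are character strings, while A, B, and G are either numeric or NULL.
If I now have a data.frame made up of list values, how can I best loop over the function above (perhaps with mapply) so that it uses the value of each column in the first row, then each column in the second row, and so on, creating multiple data frames?

argGroup <- c("Primary_Group","Primary_Group","Secondary_Group","Secondary_Group","Tertiary_Group","Tertiary_Group")
argType <- c("A","B","A","B","A","B")
argA <- c(NA, NA, NA, NA, NA, NA)
# NA rather than NULL for the last element: c() silently drops NULL and would return only 5 values
argB <- c(0.05, 0.05, NA, NA, NA, NA)
argG <- c(NA, NA, NA, NA, NA, NA)
argGroup[is.na(argGroup)] <- list(NULL)
argType[is.na(argType)] <- list(NULL)
argA[is.na(argA)] <- list(NULL)
argB[is.na(argB)] <- list(NULL)
argG[is.na(argG)] <- list(NULL)
Arguments <- cbind(argGroup, argType, argA, argB, argG)

Ideally, I would get the following data.frames generated:

Primary_Group_A_df
Primary_Group_A_hw
Primary_Group_B_df
Primary_Group_B_hw
Secondary_Group_A_df
Secondary_Group_A_hw
Secondary_Group_B_df
Secondary_Group_B_hw
Tertiary_Group_A_df
Tertiary_Group_A_hw
Tertiary_Group_B_df
Tertiary_Group_B_hw

Understanding how best (in the most automated way) to combine all of the _df objects together, and all of the _hw objects together, would also be helpful.
Any help would be amazing and greatly appreciated. Thanks so much!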
Answer 0 (score: 0)
You lose the type information by using as.data.frame(cbind(...)); just use data.frame directly:
Data <- data.frame(
  Group = rep(c("Primary_Group", "Secondary_Group", "Tertiary_Group"), each = 6L),
  Day = rep(1L:3L, 6L),
  Type = rep(rep(c("A", "B"), each = 3L), 3L),
  Value = c(7,3,10,3,9,4,0,9,3,10,1,6,3,4,10,2,3,1)
)
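To see the type loss concretely, here is a small sketch. The three-row vectors are cut down from the question's data purely for illustration, and the names via_cbind and direct are made up for this example.

```r
# Shortened versions of the question's columns (illustrative only)
Group <- c("Primary_Group", "Secondary_Group", "Tertiary_Group")
Day   <- c(1, 2, 3)
Value <- c(7, 3, 10)

# cbind() on plain vectors builds a matrix, and a matrix can hold only one
# type, so the numeric columns stop being numeric
via_cbind <- as.data.frame(cbind(Group, Day, Value))

# data.frame() keeps each column's own type
direct <- data.frame(Group, Day, Value)

sapply(via_cbind, class)
sapply(direct, class)
```

With the cbind route, Day and Value would need as.numeric() again before any arithmetic or time-series fitting.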
After that, I think you can do the following:
split_data <- split(Data, as.list(Data[, c("Group", "Type")]))
dfs <- do.call(rbind, split_data)
dfs_hw <- lapply(split_data, function(sub_data) {
  Map(argA, argB, argG, f = function(A, B, G) {
    HoltWinters(ts(sub_data, frequency = 365), alpha = A, beta = B, gamma = G)
  })
})
dfs_hw <- do.call(rbind, unlist(dfs_hw, recursive = FALSE))
But I get an error from HoltWinters, so I can't say for sure.
Also, I think dfs is just Data again, only reordered.
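One way to pair each subset with its own row of arguments, instead of running every argument against every subset, is to key an argument table by the names that split() produces. The following is only a sketch under stated assumptions: arg_table, na_to_null and fit_one are hypothetical names, the table is made up from the beta values in the question, and fit_one is a stand-in that merely records what it received, since the question itself notes that the sample is too short for a real HoltWinters fit.

```r
Data <- data.frame(
  Group = rep(c("Primary_Group", "Secondary_Group", "Tertiary_Group"), each = 6L),
  Day   = rep(1L:3L, 6L),
  Type  = rep(rep(c("A", "B"), each = 3L), 3L),
  Value = c(7, 3, 10, 3, 9, 4, 0, 9, 3, 10, 1, 6, 3, 4, 10, 2, 3, 1)
)

# One subset per Group/Type combination, named like "Primary_Group.A"
split_data <- split(Data, list(Data$Group, Data$Type))

# Hypothetical argument table: one row per subset, NA meaning "not supplied"
arg_table <- data.frame(
  Group = rep(c("Primary_Group", "Secondary_Group", "Tertiary_Group"), each = 2L),
  Type  = rep(c("A", "B"), 3L),
  B     = c(0.05, 0.05, NA, NA, NA, NA)
)
rownames(arg_table) <- paste(arg_table$Group, arg_table$Type, sep = ".")
arg_table <- arg_table[names(split_data), ]  # same order as the subsets

# Hypothetical helper: NA in the table becomes NULL for the fitting call
na_to_null <- function(x) if (is.na(x)) NULL else x

# Stand-in for the real fit; a real version might call something like
# HoltWinters(ts(sub$Value, frequency = ...), beta = na_to_null(B))
fit_one <- function(sub, B) {
  list(n = nrow(sub), beta = na_to_null(B))
}

# Map() walks the subsets and the argument column in parallel
fits <- Map(fit_one, split_data, arg_table$B)

# Stacking the subsets again gives back all rows of Data, reordered
all_df <- do.call(rbind, split_data)
```

Because Map() keeps the names of its first list argument, each result can be looked up by key, for example fits[["Primary_Group.A"]].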
Answer 1 (score: 0)
Avoid flooding your global environment with many similarly structured objects. Consider using a container such as a list to hold the many data frames. One useful method is by, which subsets a data frame by one or more factors (here, Group and Type) and returns a list of data frames. In addition, instead of iterating by row, merge the arguments with the data so that the arguments are passed once per subset.

Specifically, call by twice, once for the df list and once for the hw list. But first, merge the df and Arguments data frames by Group and Type. One challenge is that NULL cannot be stored in a data frame, so consider saving the "NULL" string and assigning temporary variables to pass into the HoltWinters arguments. Unfortunately, this converts the whole column to character type, and you need as.numeric to convert the non-NULL values.

Merge

Group <- c("Primary_Group","Primary_Group","Secondary_Group","Secondary_Group",
           "Tertiary_Group","Tertiary_Group")
Type <- c("A","B","A","B","A","B")
argA <- c("NULL", "NULL", "NULL", "NULL", "NULL", "NULL")
argB <- c(0.05, 0.05, "NULL", "NULL", "NULL", "NULL")
argG <- c("NULL", "NULL", "NULL", "NULL", "NULL", "NULL")
Arguments <- data.frame(Group, Type, argA, argB, argG, stringsAsFactors=FALSE)
df <- merge(df, Arguments, by=c("Group", "Type"))

Dataframe list (with named df elements)

# ORDER FOR NAMING LATER
df <- with(df, df[order(Type, Group),])
# DATAFRAME LIST
df_list <- by(df, df[c("Group", "Type")], identity)
# RENAME LIST
df_list <- setNames(df_list, unique(paste0(df$Group, "_", df$Type, "_df")))
# REFERENCE ELEMENTS
df_list$Primary_Group_A_df
df_list$Secondary_Group_A_df
df_list$Tertiary_Group_A_df
...

HW list (with named hw elements)

# HW LIST
hw_list <- by(df, df[c("Group", "Type")], function(sub) {
  # CONDITIONALLY ASSIGN TEMP VARIABLES
  # (BEING SUBSETS: max(arg*)==min(arg*)==mean(arg*)==median(arg*))
  if(!is.na(max(sub$argA)) & max(sub$argA) == "NULL") { tmpA <- NULL }
  else { tmpA <- max(as.numeric(sub$argA)) }
  if(!is.na(max(sub$argB)) & max(sub$argB) == "NULL") { tmpB <- NULL }
  else { tmpB <- max(as.numeric(sub$argB)) }
  if(!is.na(max(sub$argG)) & max(sub$argG) == "NULL") { tmpG <- NULL }
  else { tmpG <- max(as.numeric(sub$argG)) }
  # PASS ARGS ONCE PER SUBSET
  return(HoltWinters(ts(sub, frequency = 365), alpha=tmpA, beta=tmpB, gamma=tmpG))
})
# RENAME LIST
hw_list <- setNames(hw_list, unique(paste0(df$Group, "_", df$Type, "_hw")))
# REFERENCE ELEMENTS
hw_list$Primary_Group_A_hw
hw_list$Secondary_Group_A_hw
hw_list$Tertiary_Group_A_hw
...

Output (using 3 for the frequency of HW to conform with the posted data)

> hw_list$Primary_Group_A_hw
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = ts(sub[c("Group", "Day", "Type", "Value")], frequency = 3), alpha = tmpA, beta = tmpB, gamma = tmpG)
Smoothing parameters:
alpha: 0.2169231
beta : 0.05
gamma: 0.1
Coefficients:
[,1]
a 2.89129621
b 0.08783715
s1 0.54815382
s2 -0.12485260
s3 0.21087038
> hw_list$Secondary_Group_A_hw
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = ts(sub[c("Group", "Day", "Type", "Value")], frequency = 3), alpha = tmpA, beta = tmpB, gamma = tmpG)
Smoothing parameters:
alpha: 0.752124
beta : 0
gamma: 0
Coefficients:
[,1]
a 3.691664e+00
b 3.333333e-01
s1 3.333333e-01
s2 -1.480388e-16
s3 -3.333333e-01
> hw_list$Tertiary_Group_A_hw
Holt-Winters exponential smoothing with trend and additive seasonal component.
Call:
HoltWinters(x = ts(sub[c("Group", "Day", "Type", "Value")], frequency = 3), alpha = tmpA, beta = tmpB, gamma = tmpG)
Smoothing parameters:
alpha: 0.3145406
beta : 0
gamma: 0
Coefficients:
[,1]
a 3.022946e+00
b -3.333333e-01
s1 -3.333333e-01
s2 -1.480388e-16
s3 3.333333e-01
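The conversion from a "NULL" string back to a real NULL is repeated once per smoothing parameter, so it could be folded into a small helper. This is a sketch only: parse_arg is a hypothetical name, it assumes the argument is constant within a subset (as it is after merging one argument row per Group/Type), and unlike the inline conditions it also maps NA to NULL.

```r
# Hypothetical helper: a character column holding either "NULL" or a number
# (constant within one subset) becomes NULL or a single numeric value
parse_arg <- function(x) {
  value <- unique(x)
  stopifnot(length(value) == 1L)  # the argument must not vary within a subset
  if (is.na(value) || value == "NULL") NULL else as.numeric(value)
}

null_case    <- parse_arg(c("NULL", "NULL", "NULL"))
numeric_case <- parse_arg(c("0.05", "0.05", "0.05"))
```

Inside the function passed to by, each temporary variable could then be set with a call such as tmpB <- parse_arg(sub$argB).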