For loop for regression estimations (ECM model)

Date: 2017-12-14 13:26:30

Tags: r for-loop lm

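I am estimating a time-series error correction model for my data (using the package 'ecm'). In the code below you can see that I specify the long-run (equilibrium) and short-run (transient) variables with xeq and xtr, respectively.

These are the independent variables, used to estimate the dependent variable: Sales.

As it stands this is a pooled model, but I want to estimate it unit by unit (that is, separately for each brand). My dataset is fairly large: 360 product categories, each with 3 brands (brand 2, brand 3 and brand 4).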

library(ecm)

# xeq: equilibrium (long-run) terms; xtr: transient (short-run) terms
xeq <- DatasetThesisSynergyClean[c('lnPrice', 'lnAdvertising', 'lnDisplay', 'IntrayearCycles', 'lnCompetitorPrices', 'lnCompADV', 'lnCompDISP' , 'ADVxDISP', 'ADVxCYC', 'DISPxCYC', 'ADVxDISPxCYC')]     
xtr <- DatasetThesisSynergyClean[c('lnPrice', 'lnAdvertising', 'lnDisplay', 'IntrayearCycles', 'lnCompetitorPrices', 'lnCompADV', 'lnCompDISP', 'ADVxDISP',  'ADVxCYC', 'DISPxCYC', 'ADVxDISPxCYC')]     
model11 <- ecm(DatasetThesisSynergyClean$lnSales, xeq, xtr, includeIntercept=TRUE)
summary(model11)

What I want is one output per brand per category. To give you an idea of my data, please run the following code:

structure(list(Week = 7:17, Category = c("2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2"), Brand = c("3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3"), Display = c(0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0), Sales = c(0, 0, 0, 0, 13.440948, 40.097397, 
32.01384, 382.169189, 2830.748779, 4524.460938, 1053.590576), 
    Price = c(0, 0, 0, 0, 5.949999, 5.95, 5.950003, 4.87759, 
    3.787015, 3.205987, 4.898724), Distribution = c(0, 0, 0, 
    0, 1.394019, 1.386989, 1.621416, 8.209759, 8.552915, 9.692097, 
    9.445554), Advertising = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0), lnSales = c(11.4945151554497, 11.633214247508, 11.5862944141137, 
    11.5412559646132, 11.4811122484454, 11.4775106999991, 11.6333660772506, 
    11.4859819773102, 11.5232680456161, 11.5572670584292, 11.5303686934256
    ), IntrayearCycles = c(4.15446534315765, 3.62757053512638, 
    2.92387946552647, 2.14946414386239, 1.40455011205262, 0.768856938870769, 
    0.291497141953598, -0.0131078404184544, -0.162984144025091, 
    -0.200882782749248, -0.182877633924882), `Competitor Advertising` = c(10584.87063, 
    224846.3243, 90657.72553, 0, 0, 0, 2396.54212, 0, 0, 0, 40343.49444
    ), `Competitor Display` = c(0.385629, 2.108133, 2.515806, 
    4.918288, 3.81749, 3.035847, 2.463194, 3.242594, 1.850399, 
    1.751096, 1.337943), `Competitor Prices` = c(5.30989, 5.372752, 
    5.3717245, 5.3295525, 5.298393, 5.319466, 5.1958415, 5.2941095, 
    5.296757, 5.294059, 5.273578), ZeroSales = c(1, 1, 1, 1, 
    0, 0, 0, 0, 0, 0, 0)), .Names = c("Week", "Category", "Brand", 
"Display", "Sales", "Price", "Distribution", "Advertising", "lnSales", 
"IntrayearCycles", "Competitor Advertising", "Competitor Display", 
"Competitor Prices", "ZeroSales"), row.names = 1255:1265, class = "data.frame")

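As you can see, all categories and brands are stacked in rows. To estimate each brand separately I wanted to write a for loop, but I really don't know how to select the right category and brand so that each output is saved separately.

Eventually I want to store the coefficients, std. errors, t-values and p-values of all brands in 4 separate data frames. But first I need to get the output. Can you help me?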

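2 answers:

Answer 0: (score: 0)

I suggest you look at some of the tidyverse packages and consider a vectorised approach that combines split(df, list(df$Category, df$Brand)) with purrr's map() function to apply a function to each smaller sub-dataset. The code would look something like this: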

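library(dplyr)  # %>% and bind_rows()
library(purrr)  # map()

# a_function_for_each_group is a placeholder: it should return a data frame per group
df %>% 
  split(f = list(.$Category, .$Brand)) %>% 
  map(a_function_for_each_group) %>%
  bind_rows()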
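
To make the placeholder concrete, here is a minimal base-R sketch of the same split/apply/combine idea. It is only an illustration under stated assumptions: the data are simulated, `fit_one_group` is a hypothetical name, and `lm()` stands in for `ecm()` so that the example runs without the ecm package or the asker's dataset.

```r
# Simulated toy data: 2 categories x 2 brands, 30 weeks each (not the asker's data)
set.seed(1)
df <- expand.grid(Week = 1:30, Category = c("1", "2"), Brand = c("2", "3"),
                  stringsAsFactors = FALSE)
df$lnPrice <- rnorm(nrow(df))
df$lnSales <- 10 - 0.5 * df$lnPrice + rnorm(nrow(df), sd = 0.1)

# Hypothetical per-group function: fit one model, return one row per coefficient
fit_one_group <- function(d) {
  fit <- lm(lnSales ~ lnPrice, data = d)  # stand-in for ecm(d$lnSales, xeq, xtr)
  tab <- as.data.frame(coef(summary(fit)))
  names(tab) <- c("estimate", "std_error", "t_value", "p_value")
  data.frame(Category = d$Category[1], Brand = d$Brand[1],
             term = rownames(tab), tab, row.names = NULL)
}

# Split by Category/Brand, fit each piece, then stack the results
groups  <- split(df, list(df$Category, df$Brand), drop = TRUE)
results <- do.call(rbind, lapply(groups, fit_one_group))
rownames(results) <- NULL
print(results)
```

Because each group returns a data frame that carries its own Category and Brand, the stacked result stays identifiable without relying on list names. Swapping `lm()` for `ecm()` would mean building xeq and xtr from `d` inside the function.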

I hope I understood your question correctly.

Answer 1: (score: 0)

You can use dplyr like this:

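library(dplyr)
library(ecm)

# Fits one ECM on a single Category/Brand subset and returns its summary
f <- function(.) {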
  xeq <- as.data.frame(select(., lnPrice, lnAdvertising, lnDisplay, IntrayearCycles, lnCompetitorPrices, lnCompADV, lnCompDISP, ADVxDISP, ADVxCYC, DISPxCYC, ADVxDISPxCYC))
  xtr <- as.data.frame(select(., lnPrice, lnAdvertising, lnDisplay, IntrayearCycles, lnCompetitorPrices, lnCompADV, lnCompDISP, ADVxDISP,  ADVxCYC, DISPxCYC, ADVxDISPxCYC))
  print(xeq)
  print(xtr)
  summary(ecm(.$lnSales, xeq, xtr, includeIntercept = TRUE))
}


Models <- DatasetThesisSynergyClean %>% 
  group_by(Category, Brand) %>% 
  do(Model = f(.))


Models$Category
[1] "2" "3"
Models$Brand
[1] "3" "3"
Models$Model
[[1]]

Call:
lm(formula = dy ~ ., data = x)
# ... and so on

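You end up with a list of 3 items (Category, Brand and the model summary objects), each with a length equal to the number of unique Category/Brand combinations. I couldn't test it properly because I don't have the full data. The model summary for Category 3, Brand 3: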

Models$Model[[which(Models$Category == 3 & Models$Brand == 3)]]

Update

If you need a separate object for each model, you can give them corresponding names and use list2env():

names(Models$Model) <- paste0("C", Models$Category, "B", Models$Brand)
list2env(Models$Model, .GlobalEnv)
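
The question also asks for the coefficients, std. errors, t-values and p-values in four separate data frames. A possible sketch follows, assuming the stored objects are summary.lm objects (as the `lm(formula = dy ~ ., data = x)` call in the printed output suggests). The list `model_summaries` and the helper `extract_column` are hypothetical names, and the models are fitted with `lm()` on simulated data so the example runs on its own.

```r
set.seed(42)

# Simulated stand-in for Models$Model: a named list of summary.lm objects
make_summary <- function() {
  d <- data.frame(x1 = rnorm(40), x2 = rnorm(40))
  d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(40)
  summary(lm(y ~ x1 + x2, data = d))
}
model_summaries <- list(C2B3 = make_summary(), C3B3 = make_summary())

# Pull one column of each coefficient table; one row per model, one column per term
extract_column <- function(summaries, column) {
  as.data.frame(t(sapply(summaries, function(s) coef(s)[, column])))
}

coefficients <- extract_column(model_summaries, "Estimate")
std_errors   <- extract_column(model_summaries, "Std. Error")
t_values     <- extract_column(model_summaries, "t value")
p_values     <- extract_column(model_summaries, "Pr(>|t|)")

print(coefficients)
```

This relies on every model reporting the same set of coefficients. If a regressor is constant within a group (for example Advertising being all zeros in the sample above), lm drops it from the coefficient table, and the rows would then need to be aligned by term name before stacking.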