我正在使用具有横截面数据的Marketing Mix模型进行工作,并尝试拟合一个将分别对每个单元应用AdStock / Decay转换的函数。
广告库存转换假设广告不会立即影响客户,而是及时分发。例如:如果我们在第1周产生了100个GRP,而衰减率是0.5,则只有50个会影响第1周的销售额,第2周的25,第3周的12,5,依此类推。
我能够为汇总数据编写一个循环,但是很难将其拟合为媒体变量和衰减率的函数,该函数针对横截面数据中的每个单元分别运行。
# Preparing some data
dt1 <- data.frame(time = c(1:8), var = c(100,0,0,0,200,0,0,0))
# defining AdStock rate
adstock_rate <- 0.5
# A loop for adstocked variable
for (i in 1:length(dt1$var)){
if (i == 1) {
dt1$adstocked_advertising[i] = dt1$var[i] * adstock_rate }
else {
dt1$adstocked_advertising[i] = adstock_rate * dt1$var[i] + (1 - adstock_rate) * dt1$adstocked_advertising[i-1]
} }
结果是
time var adstocked_advertising
1 1 100 50.00000
2 2 0 25.00000
3 3 0 12.50000
4 4 0 6.25000
5 5 200 103.12500
6 6 0 51.56250
7 7 0 25.78125
8 8 0 12.89062
如何编写在横截面数据情况下可以使用的函数?例如,在此数据集中:
# Cross-sectional data
dt <- data.frame(location = rep(letters[1:2], each = 4), time = rep(1:4, 2), var = c(100,0,0,0,200,0,0,0))
# Data Frame
location time var
1 a 1 100
2 a 2 0
3 a 3 0
4 a 4 0
5 b 1 200
6 b 2 0
7 b 3 0
8 b 4 0
现在,我需要编写AdstockTransform(var,adstock_rate)函数,该函数将把第一个示例中的循环分别应用于每个位置。这很重要,因为随后应使用nlsLM之类的方法优化adstock_rate参数。
我非常感谢您的帮助。
编辑:谢谢,冻糕,您的解决方案有很大帮助。您认为可以将其用作一项功能吗?我将尝试解释为什么使用filter函数的半解法如此重要。
library(bayesm)
library(minpack.lm)
library(dplyr)
data(cheese)
# Function that describes both the AdStock and diminishing returns
adstockTransform <- function(x, as, beta){
stats::filter( 1/(1+exp(-beta*x)), as, method = "recursive")
}
mmm.data <- cheese %>%
group_by(RETAILER) %>%
mutate(log.volume = log(VOLUME), log.price = log(PRICE), adstock= adstockTransform(DISP, as, beta))
# Optimization of the parameters
fit <- nlsLM(log.volume ~ const + B1*adstockTransform(DISP, as, beta) + B2*log.price,
start = c(const = 10, B1 = 0.5, as = 0.2, beta = 2, B2 = -3),
lower = c(const = 5, B1 = 0.2, as = 0.1, beta = 1.5, B2 = -5),
upper = c(const = 12, B1 = 2, as = 0.4, beta = 6, B2 = -2),
data=mmm.data)
summary(fit)
# Output
Formula: log.volume ~ const + B1 * adstockTransform(DISP, as, beta) +
B2 * log.price
Parameters:
Estimate Std. Error t value Pr(>|t|)
const 9.77943 0.11588 84.392 < 2e-16 ***
B1 0.40081 0.11288 3.551 0.000387 ***
as 0.40000 0.12625 3.168 0.001542 **
beta 6.00000 2.86959 2.091 0.036583 *
B2 -2.00000 0.05896 -33.923 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.7709 on 5550 degrees of freedom
Number of iterations to convergence: 3
Achieved convergence tolerance: 1.49e-08
当然-为了找到固定效果模型的最佳参数,我应该在优化之前降低数据;但这是一个简单的例子。
尽管过滤器功能很有用,但它也有两个主要缺点:
1)在对销售变量进行建模的情况下,很难证明这一点。几何衰减适合其他变量-例如品牌知名度-更好
2)无法使用brms包对其进行编码
当然,它几乎与“所需的” adstock转换完全相关,但是跳过此附加过滤器-> adstock步骤将非常好。
答案 0 :(得分:0)
考虑by
来按位置对数据进行子集化并通过已定义的函数运行所需的计算,然后rbind
来处理所有数据框对象:
# Preparing some data
dt <- data.frame(location = rep(letters[1:2], each = 4),
time = rep(1:4, 2),
var = c(100,0,0,0,200,0,0,0))
# defining AdStock rate
adstock_rate <- 0.5
# define function to calculate column
calc_decay <- function(sub) {
# A loop for adstocked variable
for (i in 1:length(sub$var)){
if (i == 1) { sub$adstocked_advertising[i] <- sub$var[i] * adstock_rate }
else { sub$adstocked_advertising[i] <- sub$var[i] * adstock_rate +
(1 - adstock_rate) * sub$adstocked_advertising[i-1] }
}
return(sub)
}
# by call
df_list <- by(dt1, dt1$Location, calc_decay)
# rbind all elements
final_df <- do.call(rbind, df_list)
最终输出
final_df
# location time var adstocked_advertising
# 1 a 1 100 50.00
# 2 a 2 0 25.00
# 3 a 3 0 12.50
# 4 a 4 0 6.25
# 5 b 1 200 100.00
# 6 b 2 0 50.00
# 7 b 3 0 25.00
# 8 b 4 0 12.50