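I'm using R for an agent-based historical simulation. The code works, but it is slow. It loops through time steps, updating a data frame of agent attributes and a second structure that holds a summary of the overall state of the population after each time step (a generation). Above that loop, there are several runs for each parameter setting. Although the simulation starts with 100 agents, under extreme settings (high S, low A) the population can grow to over a thousand after, say, five generations. I read that updating a matrix is faster than updating a data frame, so I converted the summary to a matrix. But I have also heard that vectorisation is best, so before I change the agents to a matrix as well, could anyone suggest a way to make this more vectorised? Here is the code: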
NextGeneration <- function(agent, N, S, A) {
  # N is number of agents.
  # S is probability that an agent with traditional fertility will have 2 sons surviving to the age of inheritance.
  # A is probability that an heir experiencing division of estate changes his fertility preference from traditional to planned.
  # find number of surviving heirs for each agent
  excess <- runif(N) # get random numbers
  heir <- rep(1, N) # everyone has at least 1 surviving heir
  # if agent has traditional fertility 2 heirs may survive to inherit
  heir[agent$fertility == "Trad" & excess < S] <- 2
  # next generation more numerous if spare heirs survive
  # new agents have vertical inheritance but also guided variation.
  # first append to build a vector, then combine into new agent dataframe
  nextgen.fertility <- NULL
  nextgen.lineage <- NULL
  for (i in 1:N) {
    if (heir[i] == 2) {
      # two agents inherit from one parent.
      for (j in 1:2) {
        # A is probability of inheritance division event affecting fertility preference in new generation.
        if (A > runif(1)) {
          nextgen.fertility <- c(nextgen.fertility, "Plan")
        } else {
          nextgen.fertility <- c(nextgen.fertility, agent$fertility[i])
        }
        nextgen.lineage <- c(nextgen.lineage, agent$lineage[i])
      }
    } else {
      nextgen.fertility <- c(nextgen.fertility, agent$fertility[i])
      nextgen.lineage <- c(nextgen.lineage, agent$lineage[i])
    }
  }
  # assemble new agent frame
  nextgen.agent <- data.frame(nextgen.fertility, nextgen.lineage, stringsAsFactors = FALSE)
  names(nextgen.agent) <- c("fertility", "lineage")
  nextgen.agent
}
So the agents start out like this (Trad = traditional):
ID fertility lineage
1 Trad 1
2 Trad 2
3 Trad 3
4 Trad 4
5 Trad 5
After a few time steps (generations) of random changes, they end up like this:
ID fertility lineage
1 Plan 1
2 Plan 1
3 Trad 2
4 Plan 3
5 Trad 3
6 Trad 4
7 Plan 4
8 Plan 4
9 Plan 4
10 Plan 5
11 Trad 5
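Answer 0 (score: 0)
Actually, it would be more efficient to encode fertility as 0 and 1, or even to hold the agents in an integer matrix.
In any case, the existing code can be simplified a lot, so here is a vectorised solution that still uses your data.frame: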
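NextGen <- function(agent, N, S, A) {
  excess <- runif(N)
  # rows of traditional-fertility agents with two surviving heirs
  v1 <- which(agent$fertility == "Trad" & excess < S)
  # every agent once, plus a second copy of each two-heir parent
  nextgen.agent <- agent[c(seq_len(N), v1), ]
  if (length(v1) > 0L) {
    # seq.int(N + 1, nrow(nextgen.agent)) would count down when v1 is empty, so index explicitly
    idx <- c(v1, N + seq_along(v1))
    nextgen.agent[idx, "fertility"] <- ifelse(A > runif(length(idx)), "Plan", "Trad")
  }
  nextgen.agent
}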
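To see what the row indexing in NextGen does, here is a small illustration on a made-up five-agent frame, with v1 fixed by hand instead of drawn at random (both are assumptions for this example only). Indexing with the original rows followed by v1 keeps every agent once and appends a second copy of each two-heir parent at the bottom, so siblings are not adjacent the way they are in the loop version.

```r
# Hypothetical mini example: agents 2 and 4 are assumed to have two surviving heirs
agent <- data.frame(fertility = "Trad", lineage = 1:5, stringsAsFactors = FALSE)
N <- nrow(agent)
v1 <- c(2L, 4L)

# all original rows, then one extra copy of each row listed in v1
nextgen <- agent[c(seq_len(N), v1), ]
nextgen$lineage
# 1 2 3 4 5 2 4

# rows holding heirs of divided estates: the originals plus the appended copies
c(v1, N + seq_along(v1))
# 2 4 6 7
```

Only those rows get a fresh draw against A; every other row simply carries its parent's fertility forward.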
Testing with a sample agent data frame, as follows:
agentDF <- data.frame(fertility = "Trad", lineage = 1:50, stringsAsFactors = FALSE)
# use microbenchmark library to compare performance
microbenchmark::microbenchmark(
  base = {
    res1 <- NextGeneration(agentDF, 50, 0.8, 0.8) # note I fixed the two variable typos in your function
  },
  new = {
    res2 <- NextGen(agentDF, 50, 0.8, 0.8)
  },
  times = 100
)
## Unit: microseconds
## expr min lq mean median uq max neval
## base 1998.533 2163.8605 2446.561 2222.8200 2286.844 14413.173 100
## new 282.032 304.1165 329.552 320.3255 348.488 467.217 100
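The first suggestion, coding fertility as 0/1 in an integer matrix, could look roughly like the sketch below. It is not benchmarked here, and the names (NextGenInt, agentM), the column layout and the coding 1 = Trad, 0 = Plan are assumptions made for illustration, not part of the original code. It uses rep() with a per-agent heir count, so heirs of the same parent stay adjacent, as in the original loop.

```r
# Sketch only: agents as an integer matrix with columns
# "fertility" (1 = Trad, 0 = Plan; coding assumed for this example) and "lineage".
NextGenInt <- function(agent, S, A) {
  N <- nrow(agent)
  # 2 heirs for traditional agents with probability S, otherwise 1
  heir <- 1L + (agent[, "fertility"] == 1L & runif(N) < S)
  # repeat each parent's row index once per surviving heir
  parent <- rep(seq_len(N), times = heir)
  nextgen <- agent[parent, , drop = FALSE]
  # heirs of a divided estate switch to planned fertility with probability A
  divided <- heir[parent] == 2L
  switched <- divided & (runif(length(parent)) < A)
  nextgen[switched, "fertility"] <- 0L
  nextgen
}

set.seed(1)  # arbitrary seed, only to make the example reproducible
agentM <- cbind(fertility = rep(1L, 50), lineage = 1:50)
res <- NextGenInt(agentM, S = 0.8, A = 0.8)
dim(res)
table(res[, "fertility"])
```

Dropping N as an argument and reading it from nrow(agent) is a design choice of this sketch: the population size changes every generation, so deriving it from the data avoids passing a stale value.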