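I am really struggling to create a function that runs a model in which each of the variables a, b, d, g and N has multiple versions, as in the data.table below, which I have named crm: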
crm = data.table(
East = 26500,
North = c(115000, 120000, 125000, 130000, 135000, 140000),
rain = c(1049.61, 1114.31, 1361.61, 1407.2, 1499.56, 1654.13),
crop = 'Wheat', area = c(0.1718, 0.1629, 0.1082, 0.0494, 0.02, 0.004),
rn = c("10007", "10018", "10023", "10024", "10025", "10026"),
N1 = 184.262648839489, N2 = 180.312874871521, N3 = 178.615847839997,
N4 = 182.531626054579, a1 = 0.186117715072018, a2 = -0.0232731908915799,
a3 = 0.227017532149122, a4 = 0.162943230565506, b1 = 0.000478900233700419,
b2 = 0.000787931973696371, b3 = 0.000458478256537521, b4 = 0.000517304324750896,
d1 = -0.000328164576390286, d2 = -0.000112122093240884, d3 = 0.000112702113716146,
d4 = 7.40875908059628e-05, g1 = 4.04709473710477e-06, g2 = 3.68724096485995e-06,
g3 = 3.47214450131546e-06, g4 = 3.55825543257538e-06, key = 'rn'
)
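What I want to do is run the function below to compute the value of lnN and put it in a column whose header carries the same number as the variables fed into the model. That is, using a1, b1, d1, g1 and N1 should produce the column lnN1, and so on for all the 2s, 3s and 4s.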
n <- 1:4
cols <- paste0("lnN",n)
for(i in 1:length(n)){
crm[,(cols) := lapply(.SD ,function (x) {
N = crm[,7+i]
a = crm[,11+i]
b = crm[,15+i]
d = crm[,19+i]
g = crm[,23+i]
a + (b*crm[,rain]) + (g*N) + (d*crm[,rain]*N)}), .SDcols = paste0("N",n)]
}
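I have not been able to find an example of how to achieve this. I have tried using mapply, but I cannot see how to iterate over every version of each variable. Thanks for your help!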
Answer 0 (score: 2)
How about:
library(data.table) # provides setnames(), used in the pipe below
library(dplyr)
cbind(crm, do.call(cbind,
lapply(1:4, function(x) {
select(crm, c(contains(as.character(x)), rain)) %>%
setnames(gsub("[0-9]", "", names(.))) %>%
transmute(lnN = a + (b*rain) + (g*N) + (d*rain*N)) %>%
setnames(paste0("lnN", x))
})
))
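The main idea is, for each number, to select only the columns containing that number (plus rain), rename the columns to strip the digit, apply the formula, rename the result column to append the number, and then cbind the results onto the original table.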
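The same per-number idea can also be sketched without dplyr by looking up each suffixed column by its exact name and adding the result with data.table's set(). This is an alternative technique, not the code from the answer above; the table `toy`, its values, and the helper `v()` are made up for illustration.

```r
library(data.table)

# Toy table with two versions of each parameter (made-up values).
toy <- data.table(
  rain = c(1000, 1500),
  N1 = 184, N2 = 180,
  a1 = 0.19, a2 = -0.02,
  b1 = 5e-4, b2 = 8e-4,
  d1 = -3e-4, d2 = -1e-4,
  g1 = 4e-6, g2 = 4e-6
)

for (k in 1:2) {
  # Hypothetical helper: fetch the k-th version of a parameter by full name.
  v <- function(prefix) toy[[paste0(prefix, k)]]
  # Add column lnN<k> by reference using the model formula.
  set(toy, j = paste0("lnN", k),
      value = v("a") + v("b") * toy$rain + v("g") * v("N") +
        v("d") * toy$rain * v("N"))
}
```

Matching on the full column name (for example "a1") rather than on a contained digit means that version 1 is not confused with version 10 once the number of versions grows.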
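Answer 1 (score: 0)
Having looked back over the comments above, I realised there could be problems when trying to scale the number of iterations into the thousands, since both Frank and Weihuang suggested that I rethink how the data is structured.
What I did was keep the matrix of randomly generated variables as a separate data frame, e, which contains the multivariate random values of a, b, g and d. N (now called Nit) is held in rnm. So crm now has only its first six columns. The code looks like this: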
for(i in 1:n){
a = e[1,i]
b = e[2,i]
g = e[3,i]
d = e[4,i]
Nit = rnm[i]
bob = a + (b*crm$rain) + (g*Nit) + (d*crm$rain*Nit)
data.y <- cbind(data.y, bob)
}
crm <- cbind(crm, data.y)
# crm has six columns of its own, so the n new columns sit at positions 7..(6 + n)
names(crm)[6 + seq_len(n)] = names(bobs)
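For each iteration of 1:n, it reads the i-th value of each parameter (all the 1s, all the 2s, and so on), puts them into the model and creates a column called bob. bob is then merged into the empty data frame (data.y) that I created before the function. This loops until the required number of iterations is reached.
I then use cbind to merge the two together, and rename all the bob columns in order using the names stored in the data frame bobs, which contains a list of column headers numbered bob.1 to bob.n, read from a file that I generated in Excel.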
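As a rough sketch of the same calculation without growing data.y inside the loop, all n columns can be built in one sapply() call and named directly. The values of `e`, `rnm` and `rain` below are made up; the row order of `e` (a, b, g, d) follows the loop above, and the row names are added here only for readability.

```r
set.seed(1)
n <- 3
rain <- c(1049.61, 1114.31, 1361.61)

# Assumed layout: one column per iteration, rows a, b, g, d (toy values).
e <- matrix(rnorm(4 * n, sd = 0.001), nrow = 4,
            dimnames = list(c("a", "b", "g", "d"), NULL))
rnm <- runif(n, 170, 190)  # toy stand-in for the N values

# One result column per iteration; sapply() binds them into a matrix.
bob <- sapply(seq_len(n), function(i) {
  e["a", i] + e["b", i] * rain + e["g", i] * rnm[i] +
    e["d", i] * rain * rnm[i]
})
colnames(bob) <- paste0("bob.", seq_len(n))

res <- cbind(data.frame(rain = rain), bob)
```

Generating the names with paste0() removes the need to read the column headers from a separate file.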
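Answer 2 (score: 0)
Here's a melt-and-dcast (recast) version that takes advantage of the newly implemented feature of patterns for melt, which makes use of the names supplied. This feature is currently in development, so it needs the development version of data.table.
library(data.table) # requires version 1.10.5+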
# create character version of 1:N (Number of output columns)
N = paste0(seq_len(length(grep('^b', names(crm)))))
# join crm to a melt & recast version of itself using rain as
# the join key (note this will fail if the amount of rain may
# not be unique -- in this case, we should include some ID in
# id.vars, like rn, and adjust accordingly)
crm = crm[crm[ , melt(.SD, id.vars = 'rain',
measure.vars = patterns(N = '^N[0-9]', a = '^a[0-9]',
b = '^b', d = '^d', g = '^g'))
# use the formula to generate ln
][ , ln := a + b*rain + g * N + d * rain * N
# reshape wide
][ , dcast(.SD, rain ~ variable, value.var = 'ln')
# rename the columns here
][ , setnames(.SD, N, paste0('ln', N))],
on = 'rain']
# by-reference version
crm[crm[ , melt(.SD, id.vars = 'rain',
measure.vars = patterns(N = '^N[0-9]', a = '^a[0-9]',
b = '^b', d = '^d', g = '^g'))
# use the formula to generate ln
][ , ln := a + b*rain + g * N + d * rain * N
# reshape wide
][ , dcast(.SD, rain ~ variable, value.var = 'ln')],
# mget tends to be sort of slow, which is why I used the
# assign-by-copy approach first above; in larger examples,
# this slow-down may be outweighed by the cost of copying
paste0('ln', N) := mget(N), on = 'rain']
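The comments in the code above note that joining on rain fails if rainfall values are not unique, and suggest adding an ID such as rn to id.vars. A minimal sketch of that variant on a made-up two-version table follows. It passes value.name explicitly instead of relying on named patterns, which is an assumption made so that it also runs on older data.table releases; `toy`, `long` and `wide` are illustrative names.

```r
library(data.table)

# Toy table: rn is a unique ID; the two rain values are deliberately equal.
toy <- data.table(
  rn = c("10007", "10018"),
  rain = c(1200, 1200),
  N1 = 184, N2 = 180,
  a1 = 0.19, a2 = -0.02,
  b1 = 5e-4, b2 = 8e-4,
  d1 = -3e-4, d2 = -1e-4,
  g1 = 4e-6, g2 = 4e-6
)

# Long format: one row per (rn, version), one column per parameter.
long <- melt(toy, id.vars = c("rn", "rain"),
             measure.vars = patterns("^N[0-9]", "^a[0-9]", "^b", "^d", "^g"),
             value.name = c("N", "a", "b", "d", "g"))
long[, ln := a + b * rain + g * N + d * rain * N]

# Back to wide, keyed on rn rather than rain.
wide <- dcast(long, rn ~ variable, value.var = "ln")
setnames(wide, c("1", "2"), c("ln1", "ln2"))

res <- toy[wide, on = "rn"]
```

Because the join key is rn, rows with identical rainfall still match one-to-one.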