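I'm building an R script that queries a database many times (once for each combination of the elements of three vectors), but I'm having trouble figuring out how to use ldply to do this.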
tags <- c("tag1","tag2","tag3")
times <- c("2012-08-01 13:00:00","2012-08-07 21:00:00")
timesteps <- c("2m", "10m","60m", "90m")
query <- function(tag, time, timestep) {
  sql <- paste("select tag, time, timestep, value from mydb where tag = '",tag,"' and time = '",time,"' and timestep = '",timestep,"'", sep="")
  # pretend the line below is actually querying a database and returning a DF with one row
  data.frame(tag = tag, time = time, timestep = timestep, value = rnorm(1))
}
# function works correctly!
query(time = times[1], tag = tags[1], timestep = timesteps[1])
# causes an error! (Error in FUN(X[[1L]], ...) : unused argument(s) (X[[1]]))
ldply(times, query, time = times, tag = tags, timestep = timesteps)
I thought I could nest ldply three times, once per vector, but I can't even get past the first level!
Any ideas on what I can do?
Answer 0 (score: 3)
If you use mdply (or, equivalently, just mapply), I think this simplifies considerably:
tags <- c("tag1","tag2","tag3")
times <- c("2012-08-01 13:00:00","2012-08-07 21:00:00")
timesteps <- c("2m", "10m","60m", "90m")
query <- function(tags, times, timesteps) {
  sql <- paste("select tag, time, timestep, value from mydb where tag = '",tags,"' and time = '",times,"' and timestep = '",timesteps,"'", sep="")
  # pretend the line below is actually querying a database and returning a DF with one row
  data.frame(tag = tags, time = times, timestep = timesteps, value = rnorm(1))
}
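library(plyr)  # provides mdply
# build every combination; the column names must match the argument names of query()
dat <- expand.grid(tags = tags, times = times, timesteps = timesteps, stringsAsFactors = FALSE)
# call query() once per row of dat and bind the results into one data frame
mdply(dat, query)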
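Note the slight change in variable names: mdply matches the columns of the data frame to the function's arguments by name, so the two have to agree.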
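For comparison, here is a minimal base-R sketch of the mapply route mentioned above, which avoids the plyr dependency. It assumes the same three input vectors and the same stand-in query(); the names grid, pieces, and results are introduced here only for illustration.

```r
tags <- c("tag1", "tag2", "tag3")
times <- c("2012-08-01 13:00:00", "2012-08-07 21:00:00")
timesteps <- c("2m", "10m", "60m", "90m")

# Stand-in for the real database call: returns a one-row data frame
query <- function(tags, times, timesteps) {
  data.frame(tag = tags, time = times, timestep = timesteps,
             value = rnorm(1), stringsAsFactors = FALSE)
}

# All 3 * 2 * 4 = 24 combinations, kept as character rather than factor
grid <- expand.grid(tags = tags, times = times, timesteps = timesteps,
                    stringsAsFactors = FALSE)

# Call query() once per row of grid; SIMPLIFY = FALSE keeps a list of data frames
pieces <- mapply(query, grid$tags, grid$times, grid$timesteps,
                 SIMPLIFY = FALSE, USE.NAMES = FALSE)

# Stack the one-row results into a single data frame
results <- do.call(rbind, pieces)
```

Unlike mdply, this returns only the columns that query() produces, without the input columns from the grid prepended.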
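Answer 1 (score: 1)
This gets the job done using only apply. First I create an object holding the combinations of interest, then I rewrite query to take a single row of that object instead of three separate inputs.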
tags <- c("tag1","tag2","tag3")
times <- c("2012-08-01 13:00:00","2012-08-07 21:00:00")
timesteps <- c("2m", "10m","60m", "90m")
# Use expand.grid to create an object with all the combinations
dat <- expand.grid(tags, times, timesteps)
# Rewrite query to take in a row of dat
query <- function(row) {
  # extract the pieces of interest; [[ ]] returns the bare value whether
  # row is a one-row data frame (dat[1,]) or a character vector (from apply)
  tag <- row[[1]]
  time <- row[[2]]
  timestep <- row[[3]]
  sql <- paste("select tag, time, timestep, value from mydb where tag = '",tag,"' and time = '",time,"' and timestep = '",timestep,"'", sep="")
  # pretend the line below is actually querying a database and returning a DF with one row
  data.frame(tag = tag, time = time, timestep = timestep, value = rnorm(1))
}
# function works correctly on a single row
query(dat[1,])
# apply the function to each row
j <- apply(dat, 1, query)
# bind all the output together
do.call(rbind, j)
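One thing to keep in mind with apply: it converts dat to a matrix before iterating, so each row reaches query as a character vector. That is harmless here because every column is text, but if the grid had mixed column types, looping over row indices would keep them intact. The sketch below shows that variant under the same assumptions; query_row, out_list, and out are hypothetical names used only for this example.

```r
tags <- c("tag1", "tag2", "tag3")
times <- c("2012-08-01 13:00:00", "2012-08-07 21:00:00")
timesteps <- c("2m", "10m", "60m", "90m")

# Named columns make the row easier to unpack than positional indexing
dat <- expand.grid(tag = tags, time = times, timestep = timesteps,
                   stringsAsFactors = FALSE)

# Stand-in for the real database call; takes a one-row data frame
query_row <- function(row) {
  data.frame(tag = row$tag, time = row$time, timestep = row$timestep,
             value = rnorm(1), stringsAsFactors = FALSE)
}

# Iterate over row indices so each row stays a data frame with its column types
out_list <- lapply(seq_len(nrow(dat)), function(i) query_row(dat[i, ]))

# Bind the one-row results together
out <- do.call(rbind, out_list)
```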