How do I use ldply over the permutations of several vectors?

Time: 2012-08-08 14:31:34

Tags: r plyr

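I am building an R script that queries a database multiple times (once for each combination of the elements of three vectors), but I am having trouble working out how to use ldply to do this.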

tags <- c("tag1","tag2","tag3")
times <- c("2012-08-01 13:00:00","2012-08-07 21:00:00")
timesteps <- c("2m", "10m","60m", "90m")


query <- function(tag, time, timestep) {

  sql <- paste("select tag, time, timestep, value from mydb where tag = '",tag,"' and time = '",time,"' and timestep = '",timestep,"'", sep="")

  # pretend the line below is actually querying a database and returning a DF with one row
  data.frame(tag = tag, time = time, timestep = timestep, value = rnorm(1))

}
# function works correctly!  
query(time = times[1], tag = tags[1], timestep = timesteps[1])

# causes an error! (Error in FUN(X[[1L]], ...) : unused argument(s) (X[[1]]))
ldply(times, query, time = times, tag = tags, timestep = timesteps)

I thought I could nest ldply three times, once per vector, but I can't even get past the first level!

Any ideas on what I can do?

2 Answers:

Answer 0: (score: 3)

I think this simplifies considerably if you use mdply (or, equivalently, just mapply):

tags <- c("tag1","tag2","tag3")
times <- c("2012-08-01 13:00:00","2012-08-07 21:00:00")
timesteps <- c("2m", "10m","60m", "90m")


query <- function(tags, times, timesteps) {

  # build the SQL in pieces so no line break or indentation ends up inside the string
  sql <- paste0("select tag, time, timestep, value from mydb where ",
                "tag = '", tags, "' and time = '", times,
                "' and timestep = '", timesteps, "'")
  # pretend the line below is actually querying a database and returning a DF with one row
  data.frame(tag = tags, time = times, timestep = timesteps, value = rnorm(1))

}

# one row per combination; column names match query()'s argument names
dat <- expand.grid(tags = tags, times = times, timesteps = timesteps)

library(plyr)  # provides mdply
mdply(dat, query)

Note the small change in variable names, so that they are consistent between the data and the function arguments.
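The mapply route mentioned in this answer is not spelled out, so here is a minimal base-R sketch of what it could look like. It assumes the same stand-in `query` (the database call is still faked with `rnorm`); `res_list` and `result` are names chosen only for this illustration.

```r
tags <- c("tag1", "tag2", "tag3")
times <- c("2012-08-01 13:00:00", "2012-08-07 21:00:00")
timesteps <- c("2m", "10m", "60m", "90m")

# Stand-in for the real database call: returns a one-row data frame
query <- function(tags, times, timesteps) {
  data.frame(tag = tags, time = times, timestep = timesteps, value = rnorm(1))
}

# One row per combination of the three vectors
dat <- expand.grid(tags = tags, times = times, timesteps = timesteps,
                   stringsAsFactors = FALSE)

# mapply walks the three columns in parallel and calls query() once per row;
# SIMPLIFY = FALSE keeps each result as a data frame instead of collapsing it
res_list <- mapply(query, dat$tags, dat$times, dat$timesteps,
                   SIMPLIFY = FALSE, USE.NAMES = FALSE)

# Stack the one-row data frames into a single result
result <- do.call(rbind, res_list)
```

Unlike mdply, this version does not prepend the input columns to the output, so `result` contains only what `query` returns.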

Answer 1: (score: 1)

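This gets the job done, though it uses only apply. First I create an object holding the combinations of interest, then I rewrite query to take one row of that object instead of three inputs.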

tags <- c("tag1","tag2","tag3")
times <- c("2012-08-01 13:00:00","2012-08-07 21:00:00")
timesteps <- c("2m", "10m","60m", "90m")

# Use expand.grid to create an object with all the combinations
dat <- expand.grid(tags, times, timesteps)

# Rewrite query to take in a row of dat
query <- function(row) {
    # extract the pieces of interest
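    # use [[ ]] so a bare value is extracted whether row is a one-row
    # data frame (direct call) or a named character vector (from apply)
    tag <- row[[1]]
    time <- row[[2]]
    timestep <- row[[3]]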

    sql <- paste("select tag, time, timestep, value from mydb where tag = '",tag,"' and time = '",time,"' and timestep = '",timestep,"'", sep="")

    # pretend the line below is actually querying a database and returning a DF with one row
    data.frame(tag = tag, time = time, timestep = timestep, value = rnorm(1))

}

# function works correctly on a single row  
query(dat[1,])

# apply the function to each row
j <- apply(dat, 1, query)
# bind all the output together
do.call(rbind, j)
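One caveat with this approach is that apply() converts the data frame to a matrix before iterating, so every column reaches query as character. That is harmless for these string inputs, but it would matter if one of the vectors were numeric. A minimal sketch of an alternative that iterates over row indices instead, assuming the original three-argument stand-in `query` (`rows` and `result` are illustrative names):

```r
tags <- c("tag1", "tag2", "tag3")
times <- c("2012-08-01 13:00:00", "2012-08-07 21:00:00")
timesteps <- c("2m", "10m", "60m", "90m")

# Stand-in for the real database call: returns a one-row data frame
query <- function(tag, time, timestep) {
  data.frame(tag = tag, time = time, timestep = timestep, value = rnorm(1))
}

# One row per combination of the three vectors
dat <- expand.grid(tag = tags, time = times, timestep = timesteps,
                   stringsAsFactors = FALSE)

# Index into the columns row by row, so each value keeps its original type
rows <- lapply(seq_len(nrow(dat)), function(i) {
  query(dat$tag[i], dat$time[i], dat$timestep[i])
})

# Bind all the one-row results together
result <- do.call(rbind, rows)
```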