Parameterized queries in RPostgres and appending the results to a new data frame

Asked: 2019-09-27 14:47:40

Tags: r tidyverse rpostgresql

I have a set of paired values stored in the data frame parameters:

parameters <- data.frame(
   variant_id = c(1, 2, 3, 4, 5),
   start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
                  "2019-04-19"))

> parameters
  variant_id start_date
1          1 2019-07-01
2          2 2019-09-05
3          3 2019-05-21
4          4 2019-09-06
5          5 2019-04-19

I want to use each combination of variant_id and start_date as dynamic parameters in the following SQL query, which I run through RPostgres.

library(RPostgres)
library(tidyverse)

query <- "select sum(o.quantity)
from orders o
where o.date >= << start_date >>
and o.variant_id = << variant_id >> "

df <- dbGetQuery(db, query)

That would give me queries like these:

query_1 <- "select sum(o.quantity)
from orders o
where o.date >= '2019-07-01'
and o.variant_id = 1 "

result_1 <- dbGetQuery(db, query_1)
 > result_1
     sum
   1 100

query_2 <- "select sum(o.quantity)
from orders o
where o.date >= '2019-09-05'
and o.variant_id = 2 "

result_2 <- dbGetQuery(db, query_2)
 > result_2
     sum
   1 120


query_3 <- "select sum(o.quantity)
from orders o
where o.date >= '2019-05-21'
and o.variant_id = 3 "

result_3 <- dbGetQuery(db, query_3)
 > result_3
     sum
   1 140

...and so on.

I then want to append each result to a new data frame, results:

results <- data.frame(
              variant_id = c(1, 2, 3, 4, 5),
                quantity = c(100, 120, 140, 150, 160)
           )

> results
  variant_id quantity
1          1      100
2          2      120
3          3      140
4          4      150
5          5      160

How can I solve this with RPostgres and dplyr, without using a loop?

1 Answer:

Answer 0 (score: 0)

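We don't have your database, so we use the parameters and orders data frames given in the Note at the end. We added stringsAsFactors = FALSE to the definition of parameters to ensure that start_date holds character strings rather than factors.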

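Now use sprintf to create a character vector of queries, one per row of parameters, and then run each one. Since we don't have your database, we use sqldf here to keep everything reproducible, but you can replace sqldf with the appropriate call to fetch the results from your database.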

query <- "select sum(o.quantity)
  from orders o
  where o.date >= '%s'
  and o.variant_id = %s "

queries <- with(parameters, sprintf(query, start_date, variant_id))

library(sqldf)

# replace sqldf in next line with appropriate function to invoke query
do.call("rbind", lapply(queries, sqldf))
##   sum(o.quantity)
## 1               1
## 2              NA
## 3               3
## 4              NA
## 5              NA
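If the parameter values could ever come from outside your own code, building the SQL with sprintf leaves the quoting up to you. As a minimal sketch (not part of the original answer), the same character vector of queries can be built with DBI's sqlInterpolate, which quotes each value itself. ANSI() is used here only as a stand-in so the example runs without a database; with a real connection you would pass db instead, and the ?start_date / ?variant_id placeholder names are my own choice.

```r
library(DBI)

# Same test data as in the Note (stringsAsFactors = FALSE keeps the dates as strings)
parameters <- data.frame(
   variant_id = c(1, 2, 3, 4, 5),
   start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
                  "2019-04-19"), stringsAsFactors = FALSE)

# ?name marks a placeholder that sqlInterpolate fills in and quotes
query <- "select sum(o.quantity)
  from orders o
  where o.date >= ?start_date
  and o.variant_id = ?variant_id"

# sqlInterpolate takes one value per placeholder, so apply it row by row.
# ANSI() is a dummy connection (assumption: replace it with your db object).
queries <- mapply(
  function(v, d) {
    as.character(sqlInterpolate(ANSI(), query, start_date = d, variant_id = v))
  },
  parameters$variant_id, parameters$start_date)
```

The resulting queries vector can be passed to the same lapply / rbind line shown above.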

Note

# test data

parameters <- data.frame(
   variant_id = c(1, 2, 3, 4, 5),
   start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
                  "2019-04-19"), stringsAsFactors = FALSE)

orders <- data.frame(date = "2019-07-02", variant_id = 1:3, quantity = 1:3)
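Since the question asks for a result that also carries variant_id and avoids a loop, one alternative (a sketch under assumptions, not something the original answer proposes) is to let SQL do the pairing: join parameters to orders and aggregate per variant in a single query. It is shown with sqldf and the test data above. Against a real database, parameters would first have to exist on the server, for example as a temporary table written with dbWriteTable; that step is an assumption about your setup.

```r
library(sqldf)

# Same test data as in the Note
parameters <- data.frame(
   variant_id = c(1, 2, 3, 4, 5),
   start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
                  "2019-04-19"), stringsAsFactors = FALSE)

orders <- data.frame(date = "2019-07-02", variant_id = 1:3, quantity = 1:3)

# The left join keeps every parameter row; variants with no order on or
# after their start_date get NA as quantity.
results <- sqldf("select p.variant_id, sum(o.quantity) as quantity
  from parameters p
  left join orders o
    on o.variant_id = p.variant_id
   and o.date >= p.start_date
  group by p.variant_id
  order by p.variant_id")
```

With this test data only variants 1 and 3 have an order on or after their start_date, so the other rows come back with NA, matching the per-query output above.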