How to use EXTRACT through dbplyr when connected to an Oracle DB

Time: 2017-11-22 15:36:52

Tags: r oracle dbplyr

Take this query:

SELECT EXTRACT(month FROM order_date) "Month"
  FROM orders

(simplified example from the official Oracle doc)

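How would you integrate the EXTRACT operation above into a dbplyr chain?

I am open to any other workaround (even an ugly or expensive one) that extracts the month on the server side.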

2 answers:

Answer 0: (score: 4)

A more elegant way:

tbl(con, "orders") %>% mutate(Month = extract(NULL %month from% order_date))

This results in the following SQL (ANSI SQL):

EXTRACT( MONTH FROM "order_date")

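The trick works because the name of a custom infix operator (whatever sits between the percent signs) is translated literally to SQL. The NULL on the left-hand side vanishes from the generated SQL (unlike NA).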
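If the infix trick behaves differently in your dbplyr version, a related escape hatch is to pass the expression as raw SQL with `sql()`. This is a different technique from the one above, shown as a minimal sketch. It assumes a reasonably recent dbplyr in which `lazy_frame()` accepts a `con` argument, and it uses `simulate_oracle()` as a stand-in so that no real connection is needed. The one-column `orders` table here is hypothetical.

```r
library(dplyr)
library(dbplyr)

# Local stand-in for the remote "orders" table; no database is contacted.
# simulate_oracle() only selects the Oracle SQL translation rules.
orders <- lazy_frame(order_date = Sys.Date(), con = simulate_oracle())

# sql() marks the string as literal SQL, so dbplyr inserts it unchanged
# instead of trying to translate it from R.
month_query <- orders %>%
  mutate(Month = sql("EXTRACT(MONTH FROM order_date)"))

# Print the SQL that would be sent to the server.
show_query(month_query)
```

Against a real connection, replacing `orders` with `tbl(con, "orders")` should give the same pipeline. Note that the raw string is not quoted or checked by dbplyr, so the column name must match the server's case rules.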

Answer 1: (score: 0)

In the meantime, I came up with something.

The expected output for the given example is obtained by running the following:

con <- ROracle::dbConnect(drv, username, password, dbname) # your connection parameters
dplyr::tbl(con,"orders") %>%
  extract_o("Month","order_date",append = FALSE,force_upper_case = FALSE)

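Here is the code of the function. I included parameters to force the new column name to upper case (the default) and to append the new column to the existing ones (also the default). The name of the new column can be set explicitly; by default it is named after the type of value being extracted.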

#' use Oracle EXTRACT function
#' 
#' Will add a column to the table, containing extracted value,
#' optionally returns only this column
#' @param data tbl_lazy object
#' @param what type of data to extract
#' @param from column to extract from
#' @param new_col name of new column
#' @param append keep existing columns;
#' FALSE drops them and keeps only the extracted column
#' @param force_upper_case make new column name uppercase
extract_o <-function(data, what, from, new_col = what,
                     append = TRUE,force_upper_case = TRUE) {
  allowed <- c("day","month","year","hour","minute","second",
                     "timezone_hour","timezone_minute",
                     "timezone_region","timezone_abbr")
  assertthat::assert_that(
    tolower(what) %in% allowed,
    msg=paste("Choose 'what' among",
              paste0("'",allowed,"'",collapse=", ")))
  if(force_upper_case) new_col <- toupper(new_col)
  tbl_query <- as.character(dbplyr::sql_render(data)) # previous query
  append_sql <- if(append)
    paste0(paste(colnames(data),collapse=", "),", ") else ""
  query <- paste0("SELECT ", append_sql,                         # initial cols or none
                  "EXTRACT(",what," FROM ",from,") \"",new_col,  # new col
                  "\" FROM (",tbl_query,")")                     # previous query
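  # sql() is namespace-qualified so the function also works when dplyr/dbplyr are not attached
  dplyr::tbl(data$src$con, dbplyr::sql(query))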
}
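To see what SQL the function ends up sending, the string-building step can be tried in isolation. The sketch below uses a hypothetical helper, `build_extract_sql` (not part of the answer), which takes the previous query and the column names as plain character inputs, so it runs in base R without a connection. The `order_id` column is made up for illustration.

```r
# Hypothetical helper: builds the same query string as extract_o,
# but from plain character inputs instead of a tbl_lazy object.
build_extract_sql <- function(prev_query, cols, what, from,
                              new_col = what, append = TRUE) {
  # Columns to carry over, followed by a comma, or nothing at all
  append_sql <- if (append) paste0(paste(cols, collapse = ", "), ", ") else ""
  paste0("SELECT ", append_sql,
         "EXTRACT(", what, " FROM ", from, ") \"", new_col, "\"",
         " FROM (", prev_query, ")")   # previous query becomes a subquery
}

query <- build_extract_sql("SELECT * FROM orders",
                           cols = c("order_id", "order_date"),
                           what = "month", from = "order_date",
                           new_col = "Month")
cat(query, "\n")
# SELECT order_id, order_date, EXTRACT(month FROM order_date) "Month" FROM (SELECT * FROM orders)
```

With `append = FALSE` the leading column list is dropped and only the extracted column is selected. Because the previous query is wrapped as a subquery, the approach composes with whatever dplyr verbs were applied before it, at the cost of pasting column names into the SQL unquoted.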