How to create a BigQuery tbl with dplyr

Time: 2014-05-15 03:19:40

Tags: r google-bigquery dplyr

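(Edited to try to make a reproducible example)

I am trying to connect to BigQuery through dplyr (and its dependencies), and I am getting an error. What am I doing wrong?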

require(dplyr)   #installed from cran
devtools::install_github("assertthat")
devtools::install_github("bigrquery")
require(bigrquery)

billing_project = "omitted"

sql <- "SELECT year, month, day, weight_pounds FROM natality LIMIT 5"
query_exec("publicdata", "samples", sql, billing = billing_project)

# returns
# Auto-refreshing stale OAuth token.
# year month day weight_pounds
# 1 1969     1   2      8.999270
# 2 1969     1  15      8.375361
# 3 1969     1  27      9.124933
# 4 1969     1   9      6.000983
# 5 1969     1  25      7.561856

bq_db = src_bigquery("publicdata","samples", billing=billing_project)
bq_db

# returns
# src:  bigquery [publicdata/samples]
# tbls: github_nested, github_timeline, gsod, natality, shakespeare, trigrams, wikipedia


tri=tbl(bq_db, "trigrams")

# returns
# Error in UseMethod("sql_select") : 
# no applicable method for 'sql_select' applied to an object of class "bigquery"

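This last error is the one I am stuck on.

I originally installed dplyr from CRAN; it is version 0.1.3. The bigrquery package was installed from GitHub via devtools; it is version 0.1.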

sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bigrquery_0.1 dplyr_0.1.3  

loaded via a namespace (and not attached):
 [1] assertthat_0.1.0.99 devtools_1.5        digest_0.6.4        evaluate_0.5.3      httpuv_1.3.0       
 [6] httr_0.3            jsonlite_0.9.7      memoise_0.2.1       parallel_3.1.0      Rcpp_0.11.1        
[11] RCurl_1.95-4.1      stringr_0.6.2       tools_3.1.0         whisker_0.3-2  

I also tried this:

sql_q <- "SELECT year, month, day, weight_pounds FROM natality LIMIT 5"
tri=tbl(bq_db, sql(sql_q))
# results in
# Error in UseMethod("qry_fields") : 
#  no applicable method for 'qry_fields' applied to an object of class "bigquery"

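1 Answer:

Answer 0: (score: 0)

I usually do it this way: run the query with query_exec and wrap the data frame it returns in tbl_df: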

df <- tbl_df(query_exec(sql, project, max_pages = Inf))
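Once the result is a local table, the usual dplyr verbs work on it without going through the BigQuery backend. The following is only a minimal sketch of that idea, not the answerer's code: the `result` data frame is a hand-built stand-in for what query_exec returns (its values are the five rows shown in the question), `late_january` is a hypothetical name, and it assumes a recent dplyr where `as_tibble()` plays the role of `tbl_df()`.

```r
library(dplyr)

# Stand-in for the data frame that query_exec() returns (assumption: no
# BigQuery call is made here). Values are the five rows from the question.
result <- data.frame(
  year = c(1969, 1969, 1969, 1969, 1969),
  month = c(1, 1, 1, 1, 1),
  day = c(2, 15, 27, 9, 25),
  weight_pounds = c(8.999270, 8.375361, 9.124933, 6.000983, 7.561856)
)

# Wrap the local data frame; in dplyr 0.1.x this step would be tbl_df(result).
df <- as_tibble(result)

# Ordinary dplyr verbs now run locally on the downloaded rows.
late_january <- filter(df, day > 10)
summarise(late_january, mean_weight = mean(weight_pounds))
```

Note that this approach downloads the full query result first, so any filtering or aggregation that should run inside BigQuery has to be written into the SQL string itself.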