Question

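I have a PostgreSQL database connection and want to fetch a table from the database. Is it good practice to keep the connection information in a separate file? I currently have two files: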

#getthetable.R
library(tidyverse)
library(dbplyr)


## connect to db
con <- src_postgres(dbname = "thedbname",
                    host = "blablabla.amazonaws.com",
                    port = NULL,
                    user = "myname",
                    password = "1234")

thetable <- tbl(con, "thetable") %>% select(id, apples, carrots) %>% collect()

and:

#main.R
library(tidyverse)

## get data from getthetable script with connection
source("rscripts/getthetable.R")

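This makes both the con and thetable variables available in main.R. I only want the variable thetable from getthetable.R. How do I do that, leaving out the con variable?

Also, are there best practices for working with database connections in R? Is my thinking logical? Are there drawbacks to what I'm doing, or do most people just put the connection together with the main script?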

Answer 1

I also like to capture such things (like connections) in a separate file, but also in a designated environment like this:

ConnectionManager <- local({

  con <- src_postgres(dbname = "thedbname",
                      host = "blablabla.amazonaws.com",
                      port = NULL,
                      user = "myname",
                      password = "1234")

  # Only this function is returned; con stays private to the local() environment
  collectTable <- function() {
    tbl(con, "thetable") %>% select(id, apples, carrots) %>% collect()
  }

  list(collectTable = collectTable)
})

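This way you have only one object, ConnectionManager, after sourcing the file, and can get the table with ConnectionManager$collectTable(). Additionally, you can easily extend it to fetch other tables or to include some connection utility functions.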
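
To see the effect without a database, here is a minimal sketch of the same pattern. It is only an illustration: a plain list stands in for the PostgreSQL connection, the script is written to a temporary file instead of rscripts/getthetable.R, and the names sandbox and script as well as the extra pears column are made up for the demo.

```r
# Stand-in for getthetable.R: a plain list plays the role of the database
# connection (an assumption for this demo), so no PostgreSQL server is needed.
script <- tempfile(fileext = ".R")
writeLines(c(
  "ConnectionManager <- local({",
  "  con <- list(thetable = data.frame(id = 1:3, apples = c(5, 2, 7),",
  "                                    carrots = c(1, 0, 4), pears = c(9, 9, 9)))",
  "  collectTable <- function() con$thetable[, c('id', 'apples', 'carrots')]",
  "  list(collectTable = collectTable)",
  "})"
), script)

# Source into a fresh environment so we can inspect exactly what the script defines.
sandbox <- new.env()
source(script, local = sandbox)

# The script exposes a single object; con is hidden inside the local() environment.
print(ls(sandbox))
print(exists("con", envir = sandbox, inherits = FALSE))

# The table is still reachable through the returned function.
thetable <- sandbox$ConnectionManager$collectTable()
print(thetable)
```

In main.R you would normally just call source("rscripts/getthetable.R") and then ConnectionManager$collectTable(); the separate sandbox environment is used here only to make it easy to list what the script created.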
