While working with databases in R, I observed the following:
library(dplyr)
library(RSQLite)
library(dbplyr)
## Creating data
d <- data.frame(x=1:100)
## Setting up a sqlite database
dv <- dbDriver("SQLite")
con <- dbConnect(dv, dbname="test.db")
## Writing the data to the db
dbWriteTable(con, "data", d)
## Reading the data back
d.tbl <- tbl(con, "data")
## Apply a filter and collect the data
d.tbl <- d.tbl %>% filter(x < 5)
d.coll <- collect(d.tbl)
## Save the filtered data into another database
con2 <- dbConnect(dv, dbname="test2.db")
dbWriteTable(con2, "data", d.coll)
## Reading it again to make sure the filter was applied
dx1 <- dbReadTable(con, "data")
dx2 <- dbReadTable(con2, "data")
This works like a charm. But when I look at the two db files, I see:
-rw-r--r-- 1 mycomp staff 2048 19 Oct 23:24 test.db
-rw-r--r-- 1 mycomp staff 2048 19 Oct 23:24 test2.db
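They are exactly the same size, even though test2.db should hold only the rows with x < 5. And even if test2.db somehow contained the complete data plus the query, it would at least contain the query on top of what test.db holds, so I would expect it to be larger. I made sure the filter was applied.
Any insights?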
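To narrow this down, here is a minimal sketch that compares row counts with SQLite's page statistics for both files. It relies on the fact that SQLite allocates file space in whole pages, so two small tables can occupy the same number of pages; whether that explains the sizes above is an assumption to check, not a confirmed diagnosis. The helper `db_stats` is hypothetical (not part of any package), and the databases are rebuilt in temporary files rather than in test.db and test2.db.

```r
library(DBI)
library(RSQLite)

# Hypothetical helper: row count, SQLite page statistics, and on-disk size
# for one database file.
db_stats <- function(path, table = "data") {
  con <- dbConnect(RSQLite::SQLite(), dbname = path)
  on.exit(dbDisconnect(con))
  data.frame(
    rows       = nrow(dbReadTable(con, table)),
    page_size  = as.numeric(dbGetQuery(con, "PRAGMA page_size")[[1]]),
    page_count = as.numeric(dbGetQuery(con, "PRAGMA page_count")[[1]]),
    file_bytes = file.size(path)
  )
}

# Rebuild the two databases from the question in temporary files.
full_path     <- tempfile(fileext = ".db")
filtered_path <- tempfile(fileext = ".db")
d <- data.frame(x = 1:100)

con <- dbConnect(RSQLite::SQLite(), dbname = full_path)
dbWriteTable(con, "data", d)
dbDisconnect(con)

con2 <- dbConnect(RSQLite::SQLite(), dbname = filtered_path)
dbWriteTable(con2, "data", d[d$x < 5, , drop = FALSE])
dbDisconnect(con2)

# One row per database: compare rows against pages and bytes.
stats <- rbind(full = db_stats(full_path), filtered = db_stats(filtered_path))
print(stats)
```

If `rows` differs between the two databases while `page_count` and `file_bytes` match, the filter was applied and the identical file sizes come from page-level allocation rather than from identical contents.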