Ideas for avoiding memory limits when saving an R data frame to a SQLite database with dbWriteTable

Asked: 2013-01-09 21:18:39

Tags: r sqlite sqldf

Even when a data frame is small enough to load into R, a dbWriteTable call can still occasionally hit the memory ceiling if the data frame is close to the maximum available RAM. I'm wondering whether there is a better solution than reading the table into RAM in chunks, as in the code below?

I'm trying to write code that runs on older computers, so I'm using the 32-bit Windows version of R to reproduce these memory errors.
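
To see how much headroom the session actually has, the memory ceiling and the footprint of an object can be checked first. This is just a quick sketch and assumes a Windows build of R, where memory.limit() is available:

# optional check of the session's memory ceiling
# (Windows-only; roughly 3GB for 32-bit builds of R)
memory.limit()

# and the footprint of an object, here the small mtcars data frame
print( object.size( mtcars ) , units = "Kb" )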

# this example will only work on a computer with at least 3GB of RAM
# because it intentionally maxes out the 32-bit limit

# create a data frame that barely fits inside 32-bit R's memory capacity
x <- mtcars[ rep( seq( nrow( mtcars ) ) , 400000 ) , ]

# check how many records this table contains
nrow( x )

# create a connection to a SQLite database
# stored in a temporary file on disk, not in memory
library( RSQLite )
tf <- tempfile()
db <- dbConnect( SQLite() , tf )


# storing `x` in the database with dbWriteTable breaks.
# this line causes a memory error
# dbWriteTable( db , 'x' , x )

# but storing it in chunks works!
chunks <- 100

starts.stops <- floor( seq( 1 , nrow( x ) , length.out = chunks ) )


for ( i in 2:( length( starts.stops ) )  ){

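    # the first chunk starts at row 1; every later chunk starts one row
    # past where the previous chunk stopped, so no row is written twice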
    if ( i == 2 ){
        rows.to.add <- ( starts.stops[ i - 1 ] ):( starts.stops[ i ] )
    } else {
        rows.to.add <- ( starts.stops[ i - 1 ] + 1 ):( starts.stops[ i ] )
    }

    # storing `x` in the database with dbWriteTable in chunks works.
    dbWriteTable( db , 'x' , x[ rows.to.add , ] , append = TRUE )
}


# and the table contains the correct number of rows.
dbGetQuery( db , "select count(*) from x" )
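
For reference, the same chunked approach can be wrapped in a small helper so any large data frame can be appended piece by piece. This is only a sketch that refactors the loop above; the function name dbWriteTable.chunked and its arguments are made up for illustration and are not part of RSQLite:

# a reusable version of the chunked write above
# (illustrative only -- dbWriteTable.chunked is not an RSQLite function)
dbWriteTable.chunked <- function( conn , name , value , chunks = 100 ){

    # never use more chunks than rows, and always keep at least two break points
    chunks <- max( 2 , min( chunks , nrow( value ) ) )

    starts.stops <- floor( seq( 1 , nrow( value ) , length.out = chunks ) )

    for ( i in 2:length( starts.stops ) ){

        # the first chunk starts at row 1; later chunks start one row
        # past the previous stop so no row is written twice
        first.row <- if ( i == 2 ) starts.stops[ i - 1 ] else starts.stops[ i - 1 ] + 1

        dbWriteTable( conn , name , value[ first.row:starts.stops[ i ] , ] , append = TRUE )
    }

    invisible( TRUE )
}

# example usage: write the same data frame to a second table and recount
# dbWriteTable.chunked( db , 'x2' , x )
# dbGetQuery( db , "select count(*) from x2" )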

0 Answers:

No answers.