If a data frame takes up nearly all of the available RAM, then even though it is small enough to load into R, the dbWriteTable call can still occasionally hit the memory ceiling. Is there a better solution than chunking the table into RAM a piece at a time, as in the code below?
I am trying to write code that will run on older computers, so I am using the Windows 32-bit version of R to reproduce these memory errors.
# this example will only work on a computer with at least 3GB of RAM
# because it intentionally maxes out the 32-bit limit
# create a data frame that barely fits inside 32-bit R's memory capacity
x <- mtcars[ rep( seq( nrow( mtcars ) ) , 400000 ) , ]
# check how many records this table contains
nrow( x )
# create a connection to a SQLite database
# not stored in memory
library( RSQLite )
tf <- tempfile()
db <- dbConnect( SQLite() , tf )
# storing `x` in the database with dbWriteTable breaks.
# this line causes a memory error
# dbWriteTable( db , 'x' , x )
# but storing it in chunks works!
chunks <- 100
starts.stops <- floor( seq( 1 , nrow( x ) , length.out = chunks ) )
for ( i in 2:( length( starts.stops ) ) ){
	# the first chunk starts at row one;
	# every later chunk starts one row past the previous stop
	if ( i == 2 ){
		rows.to.add <- ( starts.stops[ i - 1 ] ):( starts.stops[ i ] )
	} else {
		rows.to.add <- ( starts.stops[ i - 1 ] + 1 ):( starts.stops[ i ] )
	}
	# storing `x` in the database with dbWriteTable in chunks works.
	dbWriteTable( db , 'x' , x[ rows.to.add , ] , append = TRUE )
}
# and it's the correct number of lines.
dbGetQuery( db , "select count(*) from x" )
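The same idea can be wrapped in a reusable helper so the chunk size is an explicit row count rather than a number of chunks. This is only a sketch of the approach above, not an RSQLite feature: the function name `db.write.chunked` and its `chunk.size` argument are my own, and it still relies on `dbWriteTable( ... , append = TRUE )` creating the table on the first block.

```r
library( RSQLite )

# hypothetical helper (not part of RSQLite): append data frame `d`
# to table `name` in blocks of at most `chunk.size` rows, so only
# one block at a time is copied during the write
db.write.chunked <- function( db , name , d , chunk.size = 100000 ){
	starts <- seq( 1 , nrow( d ) , by = chunk.size )
	for ( s in starts ){
		e <- min( s + chunk.size - 1 , nrow( d ) )
		dbWriteTable( db , name , d[ s:e , ] , append = TRUE )
	}
	invisible( NULL )
}

# usage: write mtcars in blocks of ten rows, then confirm the row count
tf2 <- tempfile()
db2 <- dbConnect( SQLite() , tf2 )
db.write.chunked( db2 , 'mtcars' , mtcars , chunk.size = 10 )
n <- dbGetQuery( db2 , "select count(*) from mtcars" )[ 1 , 1 ]
dbDisconnect( db2 )
```

Because each block is sliced from `d` just before it is written, the transient copy that dbWriteTable serializes is bounded by `chunk.size` rows rather than the whole table.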