How to iterate over a large result set with Perl and PostgreSQL

Time: 2018-12-17 17:38:29

Tags: postgresql perl large-data dbi database-cursor

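Perl's DBD::Pg driver for PostgreSQL always fetches the entire result set of a query. So if you iterate over a large table with a plain prepare/execute, the whole table is pulled into memory as soon as you call $sth->execute(). Prepared statements and row-at-a-time calls such as fetchrow_hashref do not help, because the rows have already been transferred to the client by then.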

If you are working with a BIG table, the following will fail badly.

use DBI;
my $dbh =   DBI->connect("dbi:Pg:dbname=big_db","user","password",{
        AutoCommit => 0,
        ReadOnly => 1,
        PrintError => 1,
        RaiseError =>  1,
});

my $sth = $dbh->prepare('SELECT * FROM big_table');
$sth->execute(); #prepare to run out of memory here
while (my $row = $sth->fetchrow_hashref('NAME_lc')){
  # do something with the $row hash
}
$dbh->disconnect();

1 Answer:

Answer 0: (score: 6)

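To work around this, declare a cursor and then use it to fetch the data in chunks. The AutoCommit and ReadOnly settings matter here: a cursor declared without WITH HOLD exists only inside a transaction, so AutoCommit must be off, and since the cursor is used purely for reading, ReadOnly fits the transaction.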

use DBI;
my $dbh =   DBI->connect("dbi:Pg:dbname=big_db","user","password",{
        AutoCommit => 0,
        ReadOnly => 1,
        PrintError => 1,
        RaiseError =>  1,
});

$dbh->do(<<'SQL');
DECLARE mycursor CURSOR FOR
SELECT * FROM big_table
SQL

my $sth = $dbh->prepare("FETCH 1000 FROM mycursor");
while (1) {
  warn "* fetching 1000 rows\n";
  $sth->execute();
  last if $sth->rows == 0;
  while (my $row = $sth->fetchrow_hashref('NAME_lc')){
    # do something with the $row hash
  }
}
$dbh->do('CLOSE mycursor');
$dbh->commit();   # end the transaction explicitly before disconnecting
$dbh->disconnect();
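
If you need this pattern in more than one place, the loop above can be wrapped in a small helper. The following is only a sketch: for_each_row, its batch_size and cursor options, and the default cursor name batch_cursor are hypothetical names made up for illustration, not part of DBI or DBD::Pg. It assumes the handle was opened with AutoCommit => 0 and RaiseError => 1 as in the answer, and that the SQL and cursor name come from trusted code, since both are interpolated into the statement.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBI or DBD::Pg): run $sql through a
# server-side cursor and pass each row to $callback, one batch at a time.
# Assumes $dbh has AutoCommit => 0 and RaiseError => 1, and that $sql and
# the cursor name are trusted, because they are interpolated into SQL.
sub for_each_row {
    my ($dbh, $sql, $callback, %opt) = @_;
    my $batch  = $opt{batch_size} || 1000;
    my $cursor = $opt{cursor}     || 'batch_cursor';

    $dbh->do("DECLARE $cursor CURSOR FOR $sql");
    my $sth = $dbh->prepare("FETCH $batch FROM $cursor");

    my $total = 0;
    while (1) {
        $sth->execute();
        my $seen = 0;
        while (my $row = $sth->fetchrow_hashref('NAME_lc')) {
            $callback->($row);
            $seen++;
        }
        last if $seen == 0;    # an empty FETCH means the cursor is exhausted
        $total += $seen;
    }

    $dbh->do("CLOSE $cursor");
    return $total;             # number of rows handed to the callback
}

# Example use with the handle from the answer:
#   my $n = for_each_row($dbh, 'SELECT * FROM big_table',
#                        sub { my ($row) = @_; ... }, batch_size => 1000);
#   $dbh->commit();
```

Only one batch of rows is held on the client at any time, so batch_size trades memory use against the number of round trips to the server.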