Python psycopg2 for loop, large database problem

Time: 2014-05-27 16:23:40

Tags: python postgresql psycopg2

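I am trying to loop over a large 8 GB database with psycopg2 and Python. I followed the documentation and I am getting an error. I want to iterate over every row without using .fetchall(), because that pulls the whole result into memory. I can't use fetchone() either, because it fetches each column individually.

Note that the first pass through the loop returns a value, and the second pass raises the error.

The documentation reads: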

Note cursor objects are iterable, so, instead of calling explicitly fetchone() in a loop, the object itself can be used:
>>> cur.execute("SELECT * FROM test;")
>>> for record in cur:
...     print record
...
(1, 100, "abc'def")
(2, None, 'dada')
(3, 42, 'bar')

My code is:

statement = ("select source_ip,dest_ip,bytes,datetime from IPS")
cursor.execute(statement)

for sip,dip,bytes,datetime in cursor:
    if sip in cidr:
        ip = sip
        in_bytes = bytes
        out_bytes = 0
        time = datetime
    else:
        ip = dip
        out_bytes = bytes
        in_bytes = 0
        time = datetime    
    cursor.execute("INSERT INTO presum (ip, in_bytes, out_bytes, datetime) VALUES (%s,%s,%s,%s);", (ip, in_bytes, out_bytes, time,))
    conn.commit()
    print "writing to presum"

I get the following error:

for sip,dip,bytes,datetime in cursor:
psycopg2.ProgrammingError: no results to fetch

3 Answers:

Answer 0: (score: 1)

It looks like you are passing a tuple to cursor.execute. Try passing the SQL string you want to run.

statement = "select source_ip,dest_ip,bytes,datetime from IPS"
cursor.execute(statement)
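
For what it's worth, parentheses around a single string literal do not create a tuple in Python; only a trailing comma does. A quick check using the statement from the question suggests the original call was already passing a plain string (the name one_tuple is made up for the comparison):

```python
# Parentheses alone only group the expression; the value is still a str.
statement = ("select source_ip,dest_ip,bytes,datetime from IPS")

# A trailing comma is what makes a one-element tuple.
one_tuple = ("select source_ip,dest_ip,bytes,datetime from IPS",)

print(type(statement).__name__)  # str
print(type(one_tuple).__name__)  # tuple
```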

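Answer 1: (score: 1)

You are replacing the result set inside the loop. Running the INSERT on the same cursor discards the pending SELECT results, so the next iteration has nothing left to fetch: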

cursor.execute("INSERT INTO presum (ip, in_bytes, out_bytes, datetime) VALUES (%s,%s,%s,%s);", (ip, in_bytes, out_bytes, time,))

Instead, do it all in SQL:

statement = """
    insert into presum (ip, in_bytes, out_bytes, datetime)

    select source_ip, bytes, 0, datetime
    from IPS
    where source_ip << %(cidr)s

    union all

    select dest_ip, 0, bytes, datetime
    from IPS
    where not source_ip << %(cidr)s
"""

# Assumption: IP is the IP class from the IPy package
from IPy import IP
cidr = IP('200.90.230/23')

cursor.execute(statement, {'cidr': cidr.strNormal()})
conn.commit()

I am assuming source_ip is of type inet. The << operator checks whether an inet address is contained within a subnet.
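
If the per-row logic has to stay in Python, one option is to read and write on separate cursors so the INSERT never touches the SELECT's result set. The following is only a sketch under assumptions: fill_presum, is_internal (standing in for the question's `sip in cidr` test), and the cursor name "ips_reader" are made up for illustration. It also uses a named cursor, which psycopg2 treats as a server-side cursor, so rows are streamed in batches instead of being loaded all at once.

```python
INSERT_SQL = (
    "INSERT INTO presum (ip, in_bytes, out_bytes, datetime) "
    "VALUES (%s, %s, %s, %s)"
)


def fill_presum(conn, is_internal, batch_size=2000):
    """Copy IPS rows into presum, reading and writing on separate cursors.

    conn        -- an open psycopg2 connection
    is_internal -- hypothetical callable replacing the `sip in cidr` test
    Returns the number of rows written.
    """
    # Passing a name makes psycopg2 create a server-side cursor; iterating
    # it pulls `itersize` rows per round trip rather than the whole table.
    reader = conn.cursor(name="ips_reader")
    reader.itersize = batch_size
    writer = conn.cursor()  # ordinary cursor, used only for the INSERTs

    reader.execute("SELECT source_ip, dest_ip, bytes, datetime FROM IPS")
    written = 0
    for sip, dip, nbytes, ts in reader:
        if is_internal(sip):
            row = (sip, nbytes, 0, ts)
        else:
            row = (dip, 0, nbytes, ts)
        writer.execute(INSERT_SQL, row)
        written += 1

    reader.close()
    writer.close()
    # Commit once at the end: a plain named cursor only lives inside its
    # transaction, so committing per row would invalidate the reader.
    conn.commit()
    return written
```

Calling it would look something like fill_presum(conn, lambda ip: ip in cidr). Committing once after the loop is also likely to be much faster than committing after every row.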

Answer 2: (score: 0)

I was intrigued by this question. I think what you could do is use cursor.fetchmany(size). For example:

cursor.execute("select * from large_table")

# Set the max number of rows to fetch at each iteration
max_rows = 100
while True:
    rows = cursor.fetchmany(max_rows)
    if not rows:
        # An empty list means the result set is exhausted
        break
    for arow in rows:
        # do some processing of the row
        pass

Maybe that will work for you?
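
The same fetchmany loop can be wrapped in a generator so the calling code still reads like a plain for loop. This is a minimal sketch; iter_rows is a made-up helper name, and it relies only on the standard DB-API fetchmany method. Two caveats apply. With psycopg2's default client-side cursor the whole result is already transferred to the client when execute returns, so fetchmany only limits the size of each Python list; a named (server-side) cursor is needed to actually bound memory. The cursor being iterated also must not be reused for other statements until the loop finishes.

```python
def iter_rows(cursor, batch_size=100):
    """Yield rows one at a time, fetching batch_size rows per fetchmany call."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            # An empty list signals that no rows are left.
            return
        for row in rows:
            yield row
```

After cursor.execute(...), the rows can then be consumed with `for row in iter_rows(cursor, 100):`.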