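I'm building a dictionary in Python, but as my data grew I started getting memory errors. I thought I could save memory by writing the data to a database instead, but the results are not the same. I think this has something to do with the behavior of defaultdict (but I'm not sure).
Here is the working Python code (Python 2 syntax; it basically builds a table of values):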
from collections import defaultdict

data = [2,5,10]
target_sum = 100

# T[x, i] is True if 'x' can be solved
# by a linear combination of data[:i+1]
T = defaultdict(bool)  # all values are False by default
T[0, 0] = True         # base case

for i, x in enumerate(data):          # i is index, x is data[i]
    for s in range(target_sum + 1):   # range is one higher than the sum so the sum itself is included
        for c in range(s / x + 1):
            if T[s - c * x, i]:
                T[s, i+1] = True

# check the python dict results
count = 0
for x in T:
    if T[x] == True:
        print x, ':', T[x]
        count = count + 1
print 'total count is ', count
# False is 152 and True is 250. Total is: 402
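The result is a large table of values (the breakdown is in the comment; this is the correct result I want). But when I change the last line of the first for statement to write to a database instead of the in-memory dictionary, the results differ.
Here is my modified code, which has the problem: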
# conn is an already open MySQL connection (setup not shown)
cursor = conn.cursor ()
cursor.execute ("DROP TABLE IF EXISTS data_table")
cursor.execute ("""
    CREATE TABLE data_table
    (
        value CHAR(80),
        state BOOL
    )
""")

# with database
for i, x in enumerate(data):          # i is index, x is data[i]
    for s in range(target_sum + 1):   # range is one higher than the sum so the sum itself is included
        for c in range(s / x + 1):
            cursor.execute(""" SELECT value, state FROM data_table WHERE value='%s' """ % ([s - c * x, i]))
            if cursor.rowcount == 0:
                #print 'nothing found, adding'
                cursor.execute (""" INSERT INTO data_table (value, state) VALUES ('%s', False)""" % ([s - c * x, i]))
            elif cursor.rowcount == 1:
                cursor.execute (""" UPDATE data_table SET state=True WHERE value = '%s'""" % ([s - c * x, i]))
                #print 'record updated'
            conn.commit()
# False is 17 and True is 286. Total is: 303
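To summarize (in case you don't want to run the code): defaultdict creates a False entry whenever something is looked up (in this case by if T[s - c * x, i]:), so to replicate this behavior I do a MySQL lookup for the value; if it doesn't exist I create it, and if it does exist I set it to True. I strongly suspect that I'm not replicating the behavior correctly.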
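The defaultdict behavior described above can be checked in isolation. The following is a minimal sketch (Python 3 syntax, with made-up keys chosen only for illustration) showing that a plain lookup of a missing key inserts it with the default value, while a membership test or .get() does not:

```python
from collections import defaultdict

T = defaultdict(bool)
T[0, 0] = True

# A plain lookup of a missing key inserts it with the default value False.
if T[5, 0]:
    pass
keys_after_lookup = sorted(T)

# Membership tests and .get() read without inserting anything.
found = (7, 0) in T
value = T.get((7, 0), False)
keys_after_get = sorted(T)

print(keys_after_lookup)  # [(0, 0), (5, 0)]
print(keys_after_get)     # [(0, 0), (5, 0)]
```

This is why the dictionary ends up holding False entries at all: they are a side effect of reading, not something the algorithm needs.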
The only thing I can think of is that Python displays a result as (222, 0) : False, whereas MySQL stores [222, 0]; I'm not sure whether that makes a difference.
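As a quick check of where the two spellings come from, here is a small sketch (Python 3 syntax) of how the % operator turns a list and a tuple into text. Because every query in the database version builds its key from a list, the stored strings should at least be consistent with each other, so this difference alone is probably not the cause of the mismatch:

```python
# A list on the right-hand side of % is formatted as one single value.
list_key = "%s" % ([222, 0])

# A tuple is unpacked as the argument list, so it must be wrapped
# in a one-element tuple to be formatted as a single value.
tuple_key = "%s" % ((222, 0),)

print(list_key)   # [222, 0]
print(tuple_key)  # (222, 0)
```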
Answer 0 (score: 1)
Your two examples do not update the same key:
# First example
if T[s - c * x, i]:
    T[s, i+1] = True
# Key is (s, i+1)

# Second example
elif cursor.rowcount == 1:
    cursor.execute (""" UPDATE data_table SET state=True WHERE value = '%s'""" % ([s - c * x, i]))
# Key is (s - c * x, i)
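IMO it would make more sense to store only the True cases in the database, which would probably make your program simpler. Otherwise, you also need to check whether (s, i+1) exists in the database, update it to True if it does, and create a new row if it doesn't.
P.S. I also couldn't find the command that sets (0, 0) to True. Shouldn't that be an insert right after creating the table?
Update: I found another problem in your code: the select command only checks whether the row exists, not what its value is. To replicate your first example correctly, your code should be: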
cursor.execute (""" INSERT INTO data_table (value, state) VALUES ('%s', True)""" % ([0, 0]))
conn.commit()
# Inserted the (0,0) case

for i, x in enumerate(data):
    for s in range(target_sum + 1):
        for c in range(s / x + 1):
            cursor.execute(""" SELECT value, state FROM data_table WHERE value='%s' """ % ([s - c * x, i]))
            if cursor.rowcount == 0:
                cursor.execute (""" INSERT INTO data_table (value, state) VALUES ('%s', False)""" % ([s - c * x, i]))
            elif cursor.rowcount == 1:
                (value, state) = cursor.fetchone()  # Gets the state
                if state:  # equivalent to your if in the first example
                    insertOrUpdate(conn, [s, i+1])
            conn.commit()
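The changed lines are marked with comments.
Update 2: That was not enough... (as I said, it would be much simpler if you stored only the True values). For readability, part of the if has been moved into a function here: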
def insertOrUpdate(conn, key):
    # Uses the module-level cursor created earlier
    cursor.execute(""" SELECT value, state FROM data_table WHERE value='%s' """ % key)
    if cursor.rowcount == 0:
        # Insert as True if not exists
        cursor.execute (""" INSERT INTO data_table (value, state) VALUES ('%s', True)""" % key)
    elif cursor.rowcount == 1:
        (value, state) = cursor.fetchone()
        if not state:
            # Update as True, if it was False
            cursor.execute (""" UPDATE data_table SET state=True WHERE value = '%s'""" % key)
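The select-then-insert-or-update logic above can often be collapsed into a single statement if the key column is unique. The following is only a sketch under assumptions: it uses Python 3 and the standard-library sqlite3 module as a stand-in for MySQL, it declares value as a PRIMARY KEY (the original table has no key), and set_true is a hypothetical helper name. MySQL has comparable statements (REPLACE and INSERT ... ON DUPLICATE KEY UPDATE), but their exact use is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: value is unique, so the database can detect an existing row.
conn.execute("CREATE TABLE data_table (value TEXT PRIMARY KEY, state INTEGER)")


def set_true(conn, key):
    """Hypothetical helper: mark key as True whether or not a row exists."""
    conn.execute(
        "INSERT OR REPLACE INTO data_table (value, state) VALUES (?, 1)",
        (str(key),),
    )


# One row that already exists with state False ...
conn.execute("INSERT INTO data_table (value, state) VALUES (?, 0)", (str([4, 1]),))
set_true(conn, [4, 1])  # existing False row becomes True
set_true(conn, [6, 1])  # missing row is created as True

rows = conn.execute("SELECT value, state FROM data_table ORDER BY value").fetchall()
print(rows)  # [('[4, 1]', 1), ('[6, 1]', 1)]
```

The ? placeholders also let the driver handle quoting, instead of building the SQL text with the % operator.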
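Update 3: For comparison, store only the True values and see how much simpler the program becomes. It also uses less disk space, takes less time, and behaves more like the defaultdict.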
cursor = conn.cursor ()
cursor.execute ("DROP TABLE IF EXISTS data_table")
cursor.execute ("""
    CREATE TABLE data_table(
        value CHAR(80)
    )
""")

cursor.execute (""" INSERT INTO data_table (value) VALUES ('%s')""" % [0, 0])
conn.commit()

for i, x in enumerate(data):          # i is index, x is data[i]
    for s in range(target_sum + 1):   # range is one higher than the sum so the sum itself is included
        for c in range(s / x + 1):
            cursor.execute(""" SELECT value FROM data_table WHERE value='%s' """ % ([s - c * x, i]))
            if cursor.rowcount == 1:
                cursor.execute(""" SELECT value FROM data_table WHERE value='%s' """ % [s, i+1])
                if cursor.rowcount == 0:
                    cursor.execute (""" INSERT INTO data_table (value) VALUES ('%s')""" % [s, i+1])
            conn.commit()
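To check that storing only the True values reproduces the dictionary result, the two approaches can be run side by side. This is a self-contained sketch under assumptions, not the code above: it uses Python 3 and the standard-library sqlite3 module in place of MySQL, stores the key as two integer columns (s, i) rather than one CHAR(80) string, and tests for a row with fetchone() because sqlite3 does not report a row count for SELECT. The function names are made up for illustration.

```python
import sqlite3
from collections import defaultdict

data = [2, 5, 10]
target_sum = 100


def true_keys_with_dict():
    """Reference result: the True keys of the defaultdict table."""
    T = defaultdict(bool)
    T[0, 0] = True
    for i, x in enumerate(data):
        for s in range(target_sum + 1):
            for c in range(s // x + 1):
                if T[s - c * x, i]:
                    T[s, i + 1] = True
    return {key for key, state in T.items() if state}


def true_keys_with_db():
    """Same table, but only True entries are stored, one row each."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_table (s INTEGER, i INTEGER, PRIMARY KEY (s, i))"
    )
    conn.execute("INSERT INTO data_table (s, i) VALUES (0, 0)")  # base case
    for i, x in enumerate(data):
        for s in range(target_sum + 1):
            for c in range(s // x + 1):
                row = conn.execute(
                    "SELECT 1 FROM data_table WHERE s = ? AND i = ?",
                    (s - c * x, i),
                ).fetchone()
                if row is not None:
                    # A row means True; a missing row means False.
                    conn.execute(
                        "INSERT OR IGNORE INTO data_table (s, i) VALUES (?, ?)",
                        (s, i + 1),
                    )
                    break  # (s, i+1) is already True, no need to try more c
    conn.commit()
    keys = set(conn.execute("SELECT s, i FROM data_table").fetchall())
    conn.close()
    return keys


dict_keys = true_keys_with_dict()
db_keys = true_keys_with_db()
print(len(db_keys), db_keys == dict_keys)
```

With the data from the question, both sets should contain the 250 True entries reported in the first example; the False rows are simply never written.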