Dictionary changed size during iteration, but I don't see where I change it

Asked: 2013-12-05 22:37:41

Tags: python json python-3.x dictionary

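I'm new to Python, so please bear with me. I'm trying to write a script that fetches the synonyms of a word if I don't already have it, and adds them to my dictionary in JSON format.

Here is my code: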

import json, sys, urllib
from urllib.request import urlopen

f = open('dict.json', 'r')
string = json.loads(f.read())
tempString = string
url = 'http://words.bighugelabs.com/api/2/myapicode/%s/json'

def main():
    crawl()

def crawl():
    for a in string:
        for b in string[a]:
            for c in string[a][b]:
                for d in string[a][b][c]:
                    if not isInDict(d):
                        addWord(d, getWord(url % d))
                    else:
                        print('[-] Ignoring ' + d)
    f.seek(0)
    f.write(tempString)
    f.truncate()
    f.close()

def isInDict(value):
    for x in list(tempString.keys()):
        if x == value:
            return True
    return False

def getWord(address):
    try:
        return urlopen(address).read().decode('utf-8')
    except:
        print('[!] Failed to get ' + address)
    return ''

def addWord(word, content):
    if content != None and content != '':
        print('[+] Adding ' + word)
        tempString[word] = content
    else:
        print('[!] Ignoring ' + word + ': content empty')

if __name__ == '__main__':
    main()

When I run it, it works fine until it reaches 'amour', and then it gives me this:

working fine
[+] Adding sex activity
[+] Adding sexual activity
[+] Adding sexual desire
[+] Adding sexual practice
[-] Ignoring amour
Traceback (most recent call last):
  File "crawler.py", line 47, in <module>
    main()
  File "crawler.py", line 10, in main
    crawl()
  File "crawler.py", line 13, in crawl
    for a in string:
RuntimeError: dictionary changed size during iteration

But I don't see where I change anything in string; I only change tempString...

P.S. In case you want the JSON data I'm reading:

{
    "love": {
        "noun": {
            "syn": ["passion", "beloved", "dear", "dearest", "honey", "sexual love", "erotic love", "lovemaking", "making love", "love life", "concupiscence", "emotion", "eros", "loved one", "lover", "object", "physical attraction", "score", "sex", "sex activity", "sexual activity", "sexual desire", "sexual practice"],
            "ant": ["hate"],
            "usr": ["amour"]
        },
        "verb": {
            "syn": ["love", "enjoy", "roll in the hay", "make out", "make love", "sleep with", "get laid", "have sex", "know", "do it", "be intimate", "have intercourse", "have it away", "have it off", "screw", "jazz", "eff", "hump", "lie with", "bed", "have a go at it", "bang", "get it on", "bonk", "copulate", "couple", "like", "mate", "pair"],
            "ant": ["hate"]
        }
    }
}

2 answers:

Answer 0 (score: 6)

In these lines:

string = json.loads(f.read())
tempString = string

you assign tempString to refer to the same dictionary object as string. Then, in addWord, you modify tempString:

    tempString[word] = content

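Because tempString is just another reference to the same dictionary object as string, this also changes string, which is the dictionary the outer for loop in crawl is iterating over.

To avoid this, use: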
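The aliasing described above can be reproduced in a few lines. This is only an illustrative sketch: the names `original` and `alias` and the sample data are made up for the demonstration and are not taken from the question's code.

```python
# Two names, one dictionary: plain assignment does not copy anything.
original = {"love": {"noun": {"syn": ["passion"]}}}
alias = original            # same object as `original`, not a copy

# Writing through one name changes the object both names refer to.
alias["amour"] = "added through the alias"

print(alias is original)    # True
print("amour" in original)  # True
```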

import copy
tempString = copy.deepcopy(string)
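A minimal sketch of how the copy decouples the loop from the writes. The data below is a small hypothetical stand-in for the contents of dict.json, and `collect_new_words` is a hypothetical helper that stores a placeholder instead of calling the web API.

```python
import copy

# Hypothetical stand-in for the data loaded from dict.json.
words = {"love": {"noun": {"syn": ["passion", "amour"], "ant": ["hate"]}}}

# Independent copy: nested dicts and lists are duplicated too.
updated = copy.deepcopy(words)

def collect_new_words(source, target):
    """Iterate over `source` while adding missing words to `target`."""
    for word in source:
        for part_of_speech in source[word]:
            for relation in source[word][part_of_speech]:
                for related in source[word][part_of_speech][relation]:
                    if related not in target:
                        # Placeholder value; the question stores the API response here.
                        target[related] = ""

collect_new_words(words, updated)

print(sorted(updated))  # ['amour', 'hate', 'love', 'passion']
print(list(words))      # ['love']
```

Because only top-level keys are added here, a shallow copy such as dict(words) would probably be enough to avoid this particular error; deepcopy is the safer choice if the nested values might be modified later.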

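Also, using a variable name like string, which is also the name of a standard-library module, is generally bad practice. It is not very descriptive, and while the name is in scope it shadows the module, so you cannot conveniently use it.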

Answer 1 (score: 0)

Let's take an example:

>>> d = {'one': 1, 'two': 2}  # starting dictionary (assumed)
>>> for i in d:
...     if d[i] == 2:
...         d.pop(i)
...
2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

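To fix this, loop over a copy of the keys. In Python 3, d.keys() returns a live view of the dictionary rather than a list, so wrap it in list():

>>> for i in list(d.keys()):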
...     if d[i] == 2:
...         d.pop(i)
...
>>> d
{'one': 1}
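Since the question is tagged python-3.x, it may help to check how the loop behaves there: d.keys() is a live view in Python 3, so the loop needs a snapshot such as list(d). The sketch below is illustrative only; `remove_value` is a hypothetical helper and the two-entry dictionary is assumed.

```python
def remove_value(d, unwanted, snapshot):
    """Pop every key whose value equals `unwanted` while looping over the keys."""
    keys = list(d) if snapshot else d.keys()
    for key in keys:
        if d[key] == unwanted:
            d.pop(key)
    return d

# Looping over the live view fails once the dictionary shrinks.
try:
    remove_value({"one": 1, "two": 2}, 2, snapshot=False)
    view_failed = False
except RuntimeError:
    view_failed = True

# Looping over a list snapshot of the keys is safe.
cleaned = remove_value({"one": 1, "two": 2}, 2, snapshot=True)

# Building a new dictionary avoids mutating during iteration altogether.
filtered = {k: v for k, v in {"one": 1, "two": 2}.items() if v != 2}

print(view_failed)  # True
print(cleaned)      # {'one': 1}
print(filtered)     # {'one': 1}
```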

So, for your specific code, try changing this:

def crawl():
    for a in string:

to:

def crawl():
    for a in list(string.keys()):

If that doesn't work, I'll take a closer look at your code later today.