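I'm new to Python, so please bear with me. I've been trying to write a script that fetches a word's synonyms if I don't already have them and adds them to my dictionary in JSON format.
Here is my code: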
import json, sys, urllib
from urllib.request import urlopen

f = open('dict.json', 'r')
string = json.loads(f.read())
tempString = string
url = 'http://words.bighugelabs.com/api/2/myapicode/%s/json'

def main():
    crawl()

def crawl():
    for a in string:
        for b in string[a]:
            for c in string[a][b]:
                for d in string[a][b][c]:
                    if not isInDict(d):
                        addWord(d, getWord(url % d))
                    else:
                        print('[-] Ignoring ' + d)
    f.seek(0)
    f.write(tempString)
    f.truncate()
    f.close()

def isInDict(value):
    for x in list(tempString.keys()):
        if x == value:
            return True
    return False

def getWord(address):
    try:
        return urlopen(address).read().decode('utf-8')
    except:
        print('[!] Failed to get ' + address)
        return ''

def addWord(word, content):
    if content != None and content != '':
        print('[+] Adding ' + word)
        tempString[word] = content
    else:
        print('[!] Ignoring ' + word + ': content empty')

if __name__ == '__main__':
    main()
When I run it, it works fine until it reaches 'amour', and then it gives me this:
working fine
[+] Adding sex activity
[+] Adding sexual activity
[+] Adding sexual desire
[+] Adding sexual practice
[-] Ignoring amour
Traceback (most recent call last):
  File "crawler.py", line 47, in <module>
    main()
  File "crawler.py", line 10, in main
    crawl()
  File "crawler.py", line 13, in crawl
    for a in string:
RuntimeError: dictionary changed size during iteration
But I don't see where I'm changing anything in string; I only change tempString...
PS: In case you want it, here is the JSON data I'm reading:
{
    "love": {
        "noun": {
            "syn": ["passion", "beloved", "dear", "dearest", "honey", "sexual love", "erotic love", "lovemaking", "making love", "love life", "concupiscence", "emotion", "eros", "loved one", "lover", "object", "physical attraction", "score", "sex", "sex activity", "sexual activity", "sexual desire", "sexual practice"],
            "ant": ["hate"],
            "usr": ["amour"]
        },
        "verb": {
            "syn": ["love", "enjoy", "roll in the hay", "make out", "make love", "sleep with", "get laid", "have sex", "know", "do it", "be intimate", "have intercourse", "have it away", "have it off", "screw", "jazz", "eff", "hump", "lie with", "bed", "have a go at it", "bang", "get it on", "bonk", "copulate", "couple", "like", "mate", "pair"],
            "ant": ["hate"]
        }
    }
}
Answer 0 (score: 6)
In these lines:
string = json.loads(f.read())
tempString = string
you make tempString refer to the same dictionary object as string. Then, in addWord, you modify tempString:
tempString[word] = content
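Because tempString is just another reference to the same dictionary object as string, this also changes string, which is the dictionary crawl() is iterating over.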
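A minimal sketch of the aliasing described above. The names `original` and `alias` and the sample data are illustrative only and do not come from the question's code:

```python
# Plain assignment binds a second name to the same dict; it does not copy it.
original = {"love": {"noun": {"syn": ["passion"]}}}
alias = original

alias["amour"] = "added through the alias"

same_object = alias is original          # both names point at one object
seen_via_original = "amour" in original  # so the new key shows up under either name
print(same_object, seen_via_original)
```

This is why adding a key through tempString changes the size of the dictionary that the outer for loop is walking.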
To avoid this, make tempString an independent copy:
import copy
tempString = copy.deepcopy(string)
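As a rough illustration of why the copy helps, the sketch below iterates over one dictionary while adding keys only to its deep copy. The names `words` and `working_copy` and the trimmed-down data are assumptions made for the example:

```python
import copy

words = {"love": {"noun": {"syn": ["passion", "amour"]}}}
working_copy = copy.deepcopy(words)  # independent object, nested dicts and lists included

# Iterating over `words` is safe because only `working_copy` grows.
for key in words:
    for part_of_speech in words[key]:
        for relation in words[key][part_of_speech]:
            for synonym in words[key][part_of_speech][relation]:
                if synonym not in working_copy:
                    working_copy[synonym] = {}

print(sorted(working_copy))
```

Since only top-level keys are added here, a shallow copy such as `dict(words)` would also avoid the error; `copy.deepcopy` is the safer choice if nested values might be modified later.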
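Also, using a variable name like string, which is also the name of a standard-library module, is generally bad practice. It is not very descriptive, and while the name is in scope it keeps you from conveniently accessing that module.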
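A small sketch of the naming problem, assuming a script that also wants the standard-library `string` module; the variable `letters` and the flag `module_still_reachable` are made up for the example:

```python
import string

letters = string.ascii_lowercase[:3]  # read from the standard-library module

string = {"love": {}}  # rebinding the name hides the module from here on

try:
    string.ascii_lowercase
    module_still_reachable = True
except AttributeError:
    # A dict has no attribute ascii_lowercase, so the lookup fails.
    module_still_reachable = False

print(letters, module_still_reachable)
```

A more descriptive name, for example `words` or `thesaurus`, avoids the clash and says what the dictionary holds.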
Answer 1 (score: 0)
Let's take an example:
>>> d = {'one': 1, 'two': 2}
>>> for i in d:
...     if d[i] == 2:
...         d.pop(i)
...
2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
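To fix this, loop over a snapshot of the keys rather than over the dictionary itself. In Python 2, d.keys() already returns a list; in Python 3 it returns a live view, so wrap it in list():
>>> d = {'one': 1, 'two': 2}
>>> for i in list(d.keys()):
...     if d[i] == 2:
...         d.pop(i)
...
2
>>> d
{'one': 1}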
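Note that on Python 3, which the question's code targets (it imports urllib.request), d.keys() returns a live view rather than a list, so a plain keys() loop still raises the error, while list() takes a snapshot that is safe to loop over. A minimal sketch with hypothetical function names:

```python
def drop_twos_via_view(d):
    # Python 3: keys() is a view onto d, so popping mid-loop is detected.
    for key in d.keys():
        if d[key] == 2:
            d.pop(key)
    return d

def drop_twos_via_snapshot(d):
    # list(d) copies the keys up front; d can then change freely.
    for key in list(d):
        if d[key] == 2:
            d.pop(key)
    return d

try:
    drop_twos_via_view({"one": 1, "two": 2})
    view_loop_failed = False
except RuntimeError:
    view_loop_failed = True

result = drop_twos_via_snapshot({"one": 1, "two": 2})
print(view_loop_failed, result)
```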
So, for your specific code, try changing this:
def crawl():
    for a in string:
to:
def crawl():
    for a in list(string.keys()):
If that doesn't work, I'll take a closer look at your code later today.