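I am working on a script that checks a list of email addresses to see whether any of them have been reported as compromised. The response is JSON data, which is essentially a list of dictionaries.
For each compromised account, I want to insert an "Email" key/value pair into every dictionary in the returned list and then export the result to a CSV file. I am currently having trouble inserting the key/value pair.
Sample of the returned data, split across lines for readability: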
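[{'Name': 'BTSec', 'Title': 'Bitcoin Security Forum Gmail Dump', 'Domain': 'forum.btcsec.com', 'BreachDate': '2014-01-09', 'AddedDate': '2014-09-10T20:30:11Z', 'ModifiedDate': '2014-09-10T20:30:11Z', 'PwnCount': 4789599, 'Description': 'In September 2014, nearly 5 million usernames and passwords were posted to a Russian Bitcoin forum. Although commonly reported as 5M "Gmail passwords", the dump also contained 123k yandex.ru addresses. Although the origin of the breach remains unclear, the breached credentials were confirmed as correct by multiple sources, albeit a number of years old.', 'LogoType': 'svg', 'DataClasses': ['Email addresses', 'Passwords'], 'IsVerified': True, 'IsFabricated': False, 'IsSensitive': False, 'IsRetired': False, 'IsSpamList': False},
{'Name': 'ExploitIn', 'Title': 'Exploit.In', 'Domain': '', 'BreachDate': '2016-10-13', 'AddedDate': '2017-05-06T07:03:18Z', 'ModifiedDate': '2017-05-06T07:03:18Z', 'PwnCount': 593427119, 'Description': 'In late 2016, a huge list of email address and password pairs appeared in a "combo list" referred to as "Exploit.In". The list contained 593 million unique email addresses, many with multiple different passwords from various online systems. The list was broadly circulated and used for "credential stuffing", that is, attackers use it to try to identify other online systems where the account owner reused their password. For detailed background on this incident, read Password reuse, credential stuffing and another billion records in Have I been pwned.', 'LogoType': 'svg', 'DataClasses': ['Email addresses', 'Passwords'], 'IsVerified': False, 'IsFabricated': False, 'IsSensitive': False, 'IsRetired': False, 'IsSpamList': False},
{'Name': 'LinkedIn', 'Title': 'LinkedIn', 'Domain': 'linkedin.com', 'BreachDate': '2012-05-05', 'AddedDate': '2016-05-21T21:35:40Z', 'ModifiedDate': '2016-05-21T21:35:40Z', 'PwnCount': 164611595, 'Description': 'In May 2016, LinkedIn had 164 million email addresses and passwords exposed. Originally hacked in 2012, the data remained out of sight until it was offered for sale on a dark market 4 years later. The passwords in the breach were stored as unsalted SHA1 hashes, the vast majority of which were quickly cracked in the days following the release of the data.', 'LogoType': 'svg', 'DataClasses': ['Email addresses', 'Passwords'], 'IsVerified': True, 'IsFabricated': False, 'IsSensitive': False, 'IsRetired': False, 'IsSpamList': False}]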
Here is my current code:
def main():
    if address != "None":
        checkAddress(address)
    elif filename != "None":
        email = [line.rstrip('\n') for line in open(filename)] # strip the newlines
        for email in email:
            checkAddress(email)
    else:
        for email in lstEmail:
            checkAddress(email)

def checkAddress(email):
    sleep = rate # Reset default acceptable rate
    check = requests.get("https://" + server + "/api/v2/breachedaccount/" + email + "?includeUnverified=true",
                         headers = headers,
                         proxies = proxies,
                         verify = sslVerify)
    if str(check.status_code) == "404": # The address has not been breached.
        print (OKGREEN + "[i] " + email + " has not been breached." + ENDC)
        time.sleep(sleep) # sleep so that we don't trigger the rate limit
        return False
    elif str(check.status_code) == "200": # The address has been breached!
        print (FAILRED + "[!] " + email + " has been breached!" + ENDC)
        data = (check.json())
        for i in data:
            data[i].append( [{'test':'test'}])
            print (i)
            print ('\n') # Temp \n for readability
        time.sleep(sleep) # sleep so that we don't trigger the rate limit
        return True
This is the error I am currently getting:
[!] j.doe@gmail.com has been breached!
Traceback (most recent call last):
  File "hibp2csv.py", line 95, in <module>
    main()
  File "hibp2csv.py", line 52, in main
    checkAddress(email)
  File "hibp2csv.py", line 70, in checkAddress
    data[i].append( [{'test':'test'}])
TypeError: list indices must be integers or slices, not dict
Answer 0 (score: 1)
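You get this error because you are iterating over the list data by element rather than by index, so on each iteration i is an element of data (a dictionary), and when you evaluate data[i] you are passing a dictionary instead of an index.
To fix this, change the for statement to for i in range(len(data)).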
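As a minimal sketch of the difference, assuming a trimmed-down, hypothetical stand-in for the list returned by check.json(), the following reproduces the TypeError with element iteration and then shows that integer indices from range(len(data)) work:

```python
# Hypothetical, trimmed-down stand-in for the list returned by check.json().
data = [
    {"Name": "BTSec", "PwnCount": 4789599},
    {"Name": "LinkedIn", "PwnCount": 164611595},
]
email = "j.doe@gmail.com"  # address taken from the traceback in the question

# Iterating directly yields the dictionaries themselves,
# so using one of them as a list index raises TypeError.
error_message = None
for item in data:
    try:
        data[item]
    except TypeError as exc:
        error_message = str(exc)

# Iterating over range(len(data)) yields integer indices, which are valid.
for i in range(len(data)):
    data[i]["Email"] = email

print(error_message)
print(data[0])
```

Item assignment is used here only to keep the sketch short; the variable names item and error_message are illustrative and do not come from the original script.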
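Also, use dict.update({"key": "value"}) to update a dictionary. Each element of data is a dictionary, and dictionaries have no append method.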
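Putting the pieces together, here is one possible sketch of tagging each breach with the address and exporting the result to CSV. The helper names tag_breaches and write_csv and the shortened sample records are assumptions made for illustration, not part of the original script. Because update modifies each dictionary in place, the loop can iterate over the elements directly and no index is needed.

```python
import csv
import io


def tag_breaches(breaches, email):
    """Add an 'Email' key to every breach dictionary in place and return the list."""
    for breach in breaches:
        breach.update({"Email": email})
    return breaches


def write_csv(rows, out):
    """Write a list of dictionaries to a file-like object as CSV."""
    # Build the header from the union of keys, in first-seen order.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical, shortened records standing in for check.json().
sample = [
    {"Name": "BTSec", "Domain": "forum.btcsec.com"},
    {"Name": "LinkedIn", "Domain": "linkedin.com"},
]

rows = tag_breaches(sample, "j.doe@gmail.com")
buffer = io.StringIO()  # in-memory stand-in for a real file
write_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with open(path, "w", newline="") in place of the StringIO buffer. List-valued fields such as DataClasses would be written as their string representation unless they are joined into a single string first.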