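I have a list of URLs that I want to loop through and scrape. My code works fine on each URL individually! But when I iterate over the list, it starts giving me this error: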
Traceback (most recent call last):
  File "/path/x.py", line 41, in <module>
    match = soup.find('div', id="threat").text
AttributeError: 'NoneType' object has no attribute 'text'
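The error shows up at a different, seemingly random URL in the list on each run: one time it fails on the third URL, another time on the second, another on the fifth. I checked every URL individually, and the code scrapes each one on its own without any problem!
Any ideas?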
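import requests
from bs4 import BeautifulSoup

# `lines` (the list of URLs) and `num` (a running counter) are defined elsewhere in the script (not shown).
for x in lines:
    source_ = requests.get(x).text
    soup = BeautifulSoup(source_, 'lxml')
    # This is the line that raises: find() returns None when the response has no <div id="threat">.
    match = soup.find('div', id="threat").text
    with open('microsoftScrapped.txt', 'a') as out:
        # Write a header for this entry, then the scraped text.
        out.write(f'\n\n\t\t\t======  #: {num} {x} ======\n\n')
        out.write(match)
    print(num)
    print(x)
    num += 1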
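
One possible way to make the loop tolerate this, sketched under assumptions: since `soup.find(...)` returns `None` when the element is absent, the response for that request may simply not contain the `threat` div (for example, if the server answers rapid consecutive requests with a different page; that cause is a guess, not something confirmed above). The helper below retries with a pause whenever the parser comes back empty. The names `scrape_with_retry`, `fetch`, `parse`, `retries`, and `delay` are hypothetical and not part of the original script.

```python
import time


def scrape_with_retry(url, fetch, parse, retries=3, delay=2.0):
    """Return parse(fetch(url)), retrying while the parser returns None.

    `fetch` maps a URL to HTML text and `parse` maps HTML to the extracted
    text or None. Both are hypothetical callables supplied by the caller.
    """
    for attempt in range(1, retries + 1):
        html = fetch(url)
        result = parse(html)
        if result is not None:
            return result
        # The element was missing from this response; pause before retrying,
        # waiting a little longer after each failed attempt.
        if attempt < retries:
            time.sleep(delay * attempt)
    # Every attempt came back empty; let the caller decide what to do.
    return None
```

In the loop above, `fetch` could be something like `lambda u: requests.get(u, timeout=10).text`, and `parse` a small function that calls `soup.find('div', id="threat")` and returns `.text` only when the result is not `None`. If `scrape_with_retry` still returns `None`, logging the URL and skipping it keeps the rest of the list going instead of crashing on `AttributeError`.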