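I'm still figuring out this web scraping thing. I get an error when I try to scrape an HTTPS site. Is it something to do with the SSL certificate, or is the site rejecting my connection? Here is my code: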
from bs4 import BeautifulSoup
import requests
import csv

with open('UrlsList.csv', newline='') as f_urls, open('Output.csv', 'w', newline='') as f_output:
    csv_urls = csv.reader(f_urls)
    csv_output = csv.writer(f_output)

    for line in csv_urls:
        # Fetch each URL, verifying the server against a local certificate file
        page = requests.get(line[0], verify=r'.\Cert.cer').text
        soup = BeautifulSoup(page, 'html.parser')
        results = soup.findAll('td', {'class' :' alpha'})

        # Write the text of every matching cell as its own row
        for r in range(len(results)):
            csv_output.writerow([results[r].text])
...which gives me a big screenful of errors, with the following at the bottom:
raise exception_type(errors)
OpenSSL.SSL.Error: []
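One thing I have not ruled out is the format of Cert.cer. As far as I understand, the path given to verify is handed to OpenSSL as a CA file, which is expected to be PEM-encoded, while .cer files exported from Windows are often DER-encoded. Below is a minimal sketch of a check for that, using only the standard library. The helper name ensure_pem and the .pem output name are made up for illustration, and I have not confirmed that this is what causes the error above.

```python
import ssl
from pathlib import Path

PEM_MARKER = b"-----BEGIN CERTIFICATE-----"


def ensure_pem(cert_path, pem_path=None):
    """Return a path to a PEM-encoded copy of the certificate at cert_path.

    Hypothetical helper: if the file already contains a PEM header it is
    returned unchanged; otherwise its bytes are assumed to be a single
    DER-encoded certificate and are re-encoded as PEM next to the original.
    """
    cert_path = Path(cert_path)
    data = cert_path.read_bytes()
    if PEM_MARKER in data:
        return cert_path
    if pem_path is None:
        pem_path = cert_path.with_suffix(".pem")
    pem_path = Path(pem_path)
    # DER_cert_to_PEM_cert only base64-wraps the bytes; it does not validate them.
    pem_path.write_text(ssl.DER_cert_to_PEM_cert(data), encoding="ascii")
    return pem_path
```

The result could then be passed to the call in my loop as verify=str(ensure_pem('Cert.cer')).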
I also tried simply passing verify=False instead, which gave me the following error:
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
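I've tried researching the answer myself, but so far I haven't found a solution. I also recently updated my PyOpenSSL to version 18. It looks like the site I'm trying to scrape is not accepting my connection, but the URL is valid and I can view the site from Chrome without any problem.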
Many thanks!