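"certificate verify failed (_ssl.c:645)>" for one specific domain

Time: 2016-08-19 20:44:56

Tags: python python-3.x ssl web-crawler python-requests

Every request to this specific domain now fails with "certificate verify failed (_ssl.c:645)>".

I'm not sure what caused this. I've been searching for an answer since last night and trying to figure out how to fix it, but somehow I can't get it to run.

I tried pip uninstall -y certifi && pip install certifi==2015.04.28, but it didn't help.

Here is my code: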

import json
import urllib.request

from bs4 import BeautifulSoup


def trade_spider(max_pages):
    page = -1

    partner_ID = 2
    location_ID = 25

    already_printed = set()

    # Region is not defined in this excerpt; presumably it is set elsewhere in the full script.
    for page in range(0, 20):
        response = urllib.request.urlopen("http://www.getyourguide.de/s/search.json?q=" + str(Region) + "&page=" + str(page))
        jsondata = json.loads(response.read().decode("utf-8"))
        # Renamed from "format" to avoid shadowing the built-in.
        activities = jsondata['activities']
        g_data = activities.strip("'<>()[]\"` ").replace('\'', '\"')
        soup = BeautifulSoup(g_data, "html.parser")

        hallo = soup.find_all("article", {"class": "activity-card activity-card-horizontal "})

        for item in hallo:
            headers = item.find_all("h3", {"class": "activity-card-title"})
            for header in headers:
                header_final = header.text.strip()
                if header_final not in already_printed:
                    already_printed.add(header_final)

            prices = item.find_all("span", {"class": "price"})
            for price in prices:
                #itemStr += ("\t" + price.text.strip().replace(",","")[2:])
                price_final = price.text.strip().replace(",", "")[2:]
                #if itemStr2 not in already_printed:
                    #print(itemStr2)
                    #already_printed.add(itemStr2)

            deeplinks = item.find_all("a", {"class": "activity-card-link"})
            for t in set(t.get("href") for t in deeplinks):
                #itemStr += "\t" + t
                deeplink_final = t
                if deeplink_final not in already_printed:
                    #print(itemStr3)
                    already_printed.add(deeplink_final)

            Language = "Deutsch"

            end_final = "Header: " + header_final + " | " + "Price: " + str(price_final) + " | " + "Deeplink: " + deeplink_final + " | " + "PartnerID: " + str(partner_ID) + " | " + "LocationID: " + str(location_ID) + " | " + "Language: " + Language
            if end_final not in already_printed:
                print(end_final)
                already_printed.add(end_final)


# Spider is not defined in this excerpt; presumably it is set elsewhere in the full script.
trade_spider(int(Spider))

This is the output:

Traceback (most recent call last):
  File "C:\Python34\lib\urllib\request.py", line 1240, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "C:\Python34\lib\http\client.py", line 1083, in request
    self._send_request(method, url, body, headers)
  File "C:\Python34\lib\http\client.py", line 1128, in _send_request
    self.endheaders(body)
  File "C:\Python34\lib\http\client.py", line 1079, in endheaders
    self._send_output(message_body)
  File "C:\Python34\lib\http\client.py", line 911, in _send_output
    self.send(msg)
  File "C:\Python34\lib\http\client.py", line 854, in send
    self.connect()
  File "C:\Python34\lib\http\client.py", line 1237, in connect
    server_hostname=server_hostname)
  File "C:\Python34\lib\ssl.py", line 376, in wrap_socket
    _context=self)
  File "C:\Python34\lib\ssl.py", line 747, in __init__
    self.do_handshake()
  File "C:\Python34\lib\ssl.py", line 983, in do_handshake
    self._sslobj.do_handshake()
  File "C:\Python34\lib\ssl.py", line 628, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Rj/Desktop/ crawling                                        scripts/GetyourGuide_International_Final.py", line 84, in <module>
    trade_spider(int(Spider))
  File "C:/Users/Raju/Desktop/scripts/GetyourGuide_International_Final.py", line 36, in trade_spider
    response = urllib.request.urlopen("http://www.getyourguide.com/s/search.json?q=" + str(Region) +"&page=" + str(page))
  File "C:\Python34\lib\urllib\request.py", line 162, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 471, in open
    response = meth(req, response)
  File "C:\Python34\lib\urllib\request.py", line 581, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python34\lib\urllib\request.py", line 503, in error
    result = self._call_chain(*args)
  File "C:\Python34\lib\urllib\request.py", line 443, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 686, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Python34\lib\urllib\request.py", line 465, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 483, in _open
    '_open', req)
  File "C:\Python34\lib\urllib\request.py", line 443, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 1283, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Python34\lib\urllib\request.py", line 1242, in do_open
    raise URLError(err)

urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)>

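Can somebody help me out? Any feedback is appreciated :)

1 Answer:

Answer 0 (score: -1)

I would investigate further by checking whether openssl can verify the certificate: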

openssl s_client -showcerts -connect www.getyourguide.de:443
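
A similar check can be run from Python itself, which shows whether the interpreter's own trust store (rather than the openssl binary's) accepts the certificate. The following is only a minimal sketch using the standard ssl and socket modules; the function name check_certificate and its parameters are made up for illustration and are not part of any library.

```python
import socket
import ssl


def check_certificate(host, port=443, cafile=None, timeout=10.0):
    """Try a verified TLS handshake with host and report the outcome.

    Returns (ok, detail): detail is the negotiated protocol version on
    success, or a description of the error on failure.
    (Hypothetical helper, written only for this illustration.)
    """
    # create_default_context() turns on certificate and hostname checks.
    # cafile may point at an alternative CA bundle in PEM format;
    # with cafile=None the system default trust store is loaded.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except (ssl.SSLError, ssl.CertificateError) as exc:
        # The handshake was reached, but verification (or TLS itself) failed.
        return False, "TLS verification failed: {}".format(exc)
    except OSError as exc:
        # DNS failure, refused connection, timeout and similar.
        return False, "connection failed: {}".format(exc)
```

Calling check_certificate("www.getyourguide.de") should return (False, ...) with the same CERTIFICATE_VERIFY_FAILED message if Python's default trust store is the problem. If certifi is installed, passing cafile=certifi.where() repeats the check against certifi's bundle instead; note that urllib.request does not use certifi on its own, so reinstalling certifi would not be expected to change the behaviour of urlopen.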