Not receiving a response from an HTTPS site using the requests module

Time: 2016-02-25 14:26:36

Tags: python python-2.7 https python-requests

I am trying to access

https://www.exploit-db.com/remote

using Python's requests module, but I am not getting a response from the page. I want to visit all the links on that page.

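import bs4
import requests

def myfun():
    # verify=False skips certificate checks and triggers InsecureRequestWarning
    response = requests.get('https://www.exploit-db.com/remote', verify=False)
    print(response.text)
    soup = bs4.BeautifulSoup(response.text)
    # select anchors whose href starts with /download/
    return [a.attrs.get('href') for a in soup.select('a[href^="/download/"]')]

def main():
    urls = myfun()
    for url in urls:
        response = requests.get(url)
        print(response.text)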

This is the output I get:

C:\Python27\requests\packages\urllib3\connectionpool.py:791: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)

1 Answer:

Answer 0 (score: 2)

The site uses a firewall that looks for "scripted" access. You can work around it by setting a User-Agent header; the value Mozilla/5.0 appears to be enough:

headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get('https://www.exploit-db.com/remote', headers=headers, verify=False)

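Note that the resulting page has no URLs that begin with download; the links there begin with https://www.exploit-db.com/download instead. You can adjust your ^= prefix match accordingly, or use *=download (a substring match) instead.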
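
To illustrate the difference between the ^= (prefix) and *= (substring) matches, here is a minimal sketch in plain Python 3 that needs no network access and no third-party packages. It is not the code from the question: the LinkCollector class, the collect_hrefs helper, and the sample markup (including the numeric IDs) are hypothetical, and startswith / in stand in for the two CSS attribute selectors.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def collect_hrefs(html):
    """Return all anchor hrefs in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Hypothetical sample markup; the IDs are made up for illustration.
SAMPLE_HTML = """
<a href="https://www.exploit-db.com/download/111">download</a>
<a href="https://www.exploit-db.com/exploits/111">details</a>
<a href="https://www.exploit-db.com/download/222">download</a>
"""

hrefs = collect_hrefs(SAMPLE_HTML)

# Like a[href^="/download/"]: a relative prefix, which matches nothing here.
relative_prefix = [h for h in hrefs if h.startswith("/download/")]

# Like a[href^="https://www.exploit-db.com/download"]: the full absolute prefix.
absolute_prefix = [
    h for h in hrefs if h.startswith("https://www.exploit-db.com/download")
]

# Like a[href*="download"]: the substring may appear anywhere in the value.
substring = [h for h in hrefs if "download" in h]

print(relative_prefix)
print(absolute_prefix)
print(substring)
```

With absolute links like these, the relative prefix matches nothing, while the absolute prefix and the substring match select the same two download links. The substring form is the looser of the two and would also pick up any other link whose URL happens to contain "download".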