Faking an HTTP GET request to download an image

Date: 2018-04-29 22:44:53

Tags: http request

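I am trying to fetch the image from this site: http://traffic.ottawa.ca/map/camera?id=95. But no matter what I try, I always get the Access Denied image instead of the real one.

Here is what I tried: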

import shutil
import requests
from fake_useragent import UserAgent
ua = UserAgent()

header = {'User-Agent': str(ua.chrome)}
url = "http://traffic.ottawa.ca/map/camera?id=95"
response = requests.get(url, headers=header, stream=True)

with open('img1.png', 'wb') as out_file:
    shutil.copyfileobj(response.raw, out_file)

I also tried Selenium:

from selenium import webdriver
from selenium.webdriver import ActionChains, DesiredCapabilities
from selenium.webdriver.common.keys import Keys

url = 'http://traffic.ottawa.ca/map/camera?id=95'

desired_capabilities = DesiredCapabilities.CHROME.copy()
desired_capabilities['chrome.page.customHeaders.User-Agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'
driver = webdriver.Chrome(desired_capabilities=desired_capabilities)
driver.get(url)
actionChains = ActionChains(driver)

actionChains.key_down(Keys.CONTROL).send_keys('S').key_up(Keys.CONTROL)
actionChains.perform()

What is the correct way to fake the request? I assumed it was all about the User-Agent, but apparently it is not.

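2 answers:

Answer 0: (score: 0)

The request appears to be blocked based on a cookie set by the website. The following works (or may not, depending on when you try it):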

curl -v --cookie "JSESSIONID=0416883CFBE4DAB71539DCCCA05C584D" http://traffic.ottawa.ca/map/camera\?id\=2025 -o image

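You can look up the JSESSIONID value in your browser's developer tools. Now that you know what the problem is, you can retrieve a valid cookie with whichever library you prefer.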
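
One way to obtain a fresh cookie, rather than copying it from the developer tools, is to let a session object collect it. The sketch below is a minimal illustration, not the answerer's code. `fetch_image_with_session` and `make_requests_session` are hypothetical helper names, and the approach assumes that requesting some page on the site first (the `landing_url` argument; which page actually sets JSESSIONID is not stated in this thread) causes the server to send the cookie.

```python
def fetch_image_with_session(session, landing_url, image_url, out_path):
    """Visit landing_url so the server can set cookies, then download image_url.

    `session` is any object with a requests.Session-style .get(url) method.
    Returns the number of bytes written to out_path.
    """
    # First request: a requests.Session stores cookies from Set-Cookie headers.
    session.get(landing_url)

    # Second request: the stored cookies (e.g. JSESSIONID) are sent automatically.
    response = session.get(image_url)
    response.raise_for_status()

    with open(out_path, "wb") as out_file:
        out_file.write(response.content)
    return len(response.content)


def make_requests_session(user_agent):
    """Build a requests.Session that sends the given User-Agent on every request."""
    # Imported here so the helper above can be used with any session-like object.
    import requests

    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


# Example usage (the landing URL is an assumption, not confirmed by the thread):
# session = make_requests_session("Mozilla/5.0 (X11; Linux x86_64)")
# fetch_image_with_session(
#     session,
#     "http://traffic.ottawa.ca/map",
#     "http://traffic.ottawa.ca/map/camera?id=95",
#     "img1.jpg",
# )
```

If the server still returns the Access Denied image, the cookie is probably set by a different page or by JavaScript, in which case the landing URL would need to be adjusted after inspecting the requests in the developer tools.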

Answer 1: (score: 0)

As Slava Knyazev explained, adding a cookie solved the problem. Here is the final code:

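import shutil
import requests
from fake_useragent import UserAgent

# Pick a Chrome-like User-Agent string
ua = UserAgent()
header = {'User-Agent': str(ua.chrome)}
url = "http://traffic.ottawa.ca/map/camera?id=95"
# Session cookie copied from the browser's developer tools; it may stop working once the session expires
cookie = {'JSESSIONID': '0416883CFBE4DAB71539DCCCA05C584D'}

# Send the cookie with the request and stream the response body
response = requests.get(url, headers=header, stream=True, cookies=cookie)

# Write the raw response stream to disk
with open('img1.png', 'wb') as out_file:
    shutil.copyfileobj(response.raw, out_file)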