Unable to access a website's internal URL directly

Time: 2018-03-02 21:40:32

Tags: python http http-headers httpsession

I noticed that some URLs fetch metadata (JSON) that the browser uses to render the site. For example, when I open example.com, the Network tab of Firefox's developer tools shows URLs such as https://example.com/server/getdata?cmd=showResults.

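My problem is this: I can open that URL in a new tab of the same Firefox window and get the expected JSON data, but I cannot access the same URL from a different Firefox window (it returns empty JSON). The site seems to maintain some kind of session (possibly via cookies?). I copied exactly the same HTTP header values from the developer view and wrote a Python script with requests to test it, but the Python script also returns empty JSON.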

Sample screenshot

Request URL

Python code

parameter = {
    "Accept": "application/json, text/javascript, */*; q=0.01",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.5",
    "Cache-Control": "no-store, max-age=0",
    "Connection": "keep-alive",
    "Content-Length": "13175",
    "Content-Type": "application/x-www-form-urlencoded",
    "DNT": "1",
    "Host": "in.example.com",
    "Cookie": '__cfduid=xxxxxxxxxxxxxxxxx; __cfruid=xxxxxxx-1520022406; mqttuid=1.361660689',
    "Referer": "https://in.example.com/page1/page2",
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0",
    "X-Requested-With": "XMLHttpRequest"
}
response = requests.post(url="https://in.example.com/serv/getData?cmd=XXXX&type=XX&XXXX=1&_=1520022652009", data=parameter)
#print(dir(response))
print(response.headers)
print(response.json())

How can I simulate the session and hit the URL directly, without hitting the root site first?

PS: The website is a static site.

UPDATE1

Changed the call to pass headers=parameter:

response = requests.get(url="https://in.example.com/server1/getallData?cmd=xxxx&_=1520097652234", headers=parameter)

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='in.example.com', port=443): Max retries exceeded with url: /server1/getallData?cmd=xxxxx&_=1520097652934 (Caused by <class 'ConnectionResetError'>: [Errno 104] Connection reset by peer)

Now I get a Connection Reset exception. It looks like CF is doing something? Any ideas?

1 answer:

Answer 0 (score: 0)

Are you passing the headers as data?

You should use headers instead.
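
To make the difference concrete, here is a minimal sketch that prepares two requests without sending them, so it needs no network access. The URL and the trimmed header dict are placeholders based on the question, not values confirmed to work against the real site.

```python
import requests

URL = "https://in.example.com/serv/getData"  # placeholder host from the question

# A trimmed-down copy of the header dict from the question.
browser_headers = {
    "Accept": "application/json, text/javascript, */*; q=0.01",
    "X-Requested-With": "XMLHttpRequest",
    "Referer": "https://in.example.com/page1/page2",
}

# data= encodes the dict as a form body; none of it becomes an HTTP header.
as_body = requests.Request("POST", URL, data=browser_headers).prepare()

# headers= puts the same dict into the real request headers.
as_headers = requests.Request("GET", URL, headers=browser_headers).prepare()

print("X-Requested-With" in as_body.headers)   # the header is missing here
print(as_body.headers.get("Content-Type"))     # form encoding was chosen instead
print(as_headers.headers["X-Requested-With"])  # the header is present here
```

With data=, the server receives a form-encoded body containing fields named "Accept", "Referer" and so on, while the Cookie and X-Requested-With values it presumably checks never arrive as headers.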

import requests
headers = {'User-Agent': 'Bot'}
requests.get('https://example.com/params', headers=headers)
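
If the empty JSON is caused by missing session cookies, as the question suspects, one option is to let requests.Session collect them instead of copying a Cookie header by hand. The sketch below is only an illustration under that assumption: build_session and fetch_json are hypothetical helper names, the host and paths are placeholders from the question, and the site may require more than cookies (for example values set by JavaScript).

```python
import requests

BASE = "https://in.example.com"  # placeholder host from the question


def build_session(user_agent):
    """Create a Session whose cookie jar persists across requests."""
    session = requests.Session()
    session.headers.update({
        "User-Agent": user_agent,
        "X-Requested-With": "XMLHttpRequest",
    })
    return session


def fetch_json(session, page_path, api_path, params):
    """Visit the page first, then call the JSON endpoint with the same session."""
    # Load the normal page so the server can set its cookies on the session.
    session.get(BASE + page_path, timeout=10)
    # Cookies stored in the session are sent automatically with this call.
    response = session.get(
        BASE + api_path,
        params=params,
        headers={"Referer": BASE + page_path},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()
```

A call would look like fetch_json(build_session("Mozilla/5.0 ..."), "/page1/page2", "/serv/getData", {"cmd": "XXXX"}). This still loads one page before the API call; if the server issues the session cookie on that page, that first request probably cannot be skipped.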