Accessing the Location response header with Python

Time: 2017-07-09 22:08:07

Tags: python ajax web

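I am trying to read the Location field in the response headers of a GET request to this URL: https://dbr.ee/aUJA/d?. So far, I have been able to view a Location field with this Python code: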

import requests

# allow_redirects=False returns the 302 response itself instead of following it
r = requests.get('https://dbr.ee/aUJA/d?', allow_redirects=False, headers={'User-Agent': 'Mozilla/5.0'})
print(r.headers)

But the output contains the wrong Location field:

  

{'Status': '302 Found', 'X-Request-Id': '9e968067-1bee-4cc9-9305-19d45d5cb6ea', 'X-XSS-Protection': '1; mode=block', 'X-Content-Type-Options': 'nosniff', 'Transfer-Encoding': 'chunked', 'Set-Cookie': '__cfduid=d21c538fd46c153a046bf461ca281978d1499637583; expires=Mon, 09-Jul-18 21:59:43 GMT; path=/; domain=.dbr.ee; HttpOnly, ahoy_visitor=f4f1c08c-add3-45c0-8325-675b1caf3048; path=/; expires=Tue, 09 Jul 2019 21:59:44 -0000, ahoy_visit=cdbb4ca8-3272-473c-8562-03596d88ec0f; path=/; expires=Mon, 10 Jul 2017 01:59:44 -0000, ahoy_track=true; path=/, SERVERID=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/', 'X-Runtime': '0.006820', 'Server': 'cloudflare-nginx', 'Connection': 'keep-alive', 'Location': 'hhttps://dbr.ee/aUJA', 'Cache-Control': 'no-cache', 'Date': 'Sun, 09 Jul 2017 21:59:44 GMT', 'X-Frame-Options': 'SAMEORIGIN', 'Content-Type': 'text/html; charset=utf-8', 'CF-RAY': '37be8d52fdc83822-ATL'}

That is:

  

'Location': 'hhttps://dbr.ee/aUJA'

On the website, the actual response headers are these (viewed through Chrome Developer Tools):

  

cache-control: no-cache
cf-ray: 37be8bacacb437d4-ATL
content-type: text/html; charset=utf-8
date: Sun, 09 Jul 2017 21:58:36 GMT
location: hhttps://s.dbr.ee/sffc/python%2Dlogo%2Dmaster%2Dv3%2DTM.png.zip?temp_url_sig=41ebabb749293a6fe3f3ec82c5ab8ec01b0ed053&temp_url_expires=1499637816&filename=python-logo-master-v3-TM.png.zip;&attachment
server: cloudflare-nginx
set-cookie: ahoy_visit=f7d15e42-155c-443f-a637-22c3681863a5; path=/; expires=Mon, 10 Jul 2017 01:58:36 -0000
set-cookie: _dbree_session=U2x6akdCbUJ4c28wdW9MeUFYOXo1QUVxLzV3ZVNxcGtTWW1jbVdkWEdPOWZPMWFiOEl4M0VWY1dOWGNYTjNubEJoVWJHejRCTlQwQlkwL0UrM09QallTMzhFZlU3RFBBTDZxaW9xcGRMeXNlQS9mZFByYTZQWTM0ZlBHMU50ekhhTkt1bjZENXJHRnc2a3dWeGY2d3BBPT0tLVNKOTJnL0Q3SjloWEc0MTZqTnRPNFE9PQ%3D%3D--2dd8f3e77a673f385c9a231af426b55f1d1f71c0; domain=dbr.ee; path=/; HttpOnly
set-cookie: SERVERID=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/
status: 302
status: 302 Found
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-request-id: f57f3ca7-c7aa-4449-a2d7-7b5014010d0f
x-runtime: 0.015892
x-xss-protection: 1; mode=block

The Location header there is:

  

location: hhttps://s.dbr.ee/sffc/python%2Dlogo%2Dmaster%2Dv3%2DTM.png.zip?temp_url_sig=41ebabb749293a6fe3f3ec82c5ab8ec01b0ed053&temp_url_expires=1499637816&filename=python-logo-master-v3-TM.png.zip;&attachment

This is the download link I am trying to retrieve in Python. It shows up in Developer Tools after clicking the "Direct Download" button.

How can I get the headers in Python to show the correct Location field?

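*Links have been modified with an extra h in front of http because I am not allowed to post more than 2 links, but they are necessary for the context of the question.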

1 answer:

Answer 0: (score: 1)

It looks like the problem is the missing Referer header. Once I added it to your code, I got the appropriate 302 redirect response with the correct Location header.
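A minimal sketch of that change, reusing the request from the question and adding a Referer header. The Referer value used here (the file's landing page, https://dbr.ee/aUJA) is an assumption, as is the helper name `fetch_location`; the exact value the server checks for is not confirmed. The `get` parameter is also hypothetical and exists only so the function can be exercised without network access.

```python
def fetch_location(url, referer, get=None):
    """Return the Location header of the unfollowed response, or None if absent."""
    if get is None:
        import requests  # third-party; imported lazily so `get` can be stubbed
        get = requests.get
    headers = {
        'User-Agent': 'Mozilla/5.0',
        'Referer': referer,  # assumed value, see the note above
    }
    # allow_redirects=False keeps the 302 response so its headers can be read
    response = get(url, allow_redirects=False, headers=headers)
    return response.headers.get('Location')


# Example call (performs a real HTTP request, so it is left commented out):
# print(fetch_location('https://dbr.ee/aUJA/d?', referer='https://dbr.ee/aUJA'))
```

Header names in `r.headers` are case-insensitive in requests, so `'Location'` and `'location'` look up the same field.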

With the Referer header set, the response headers are:

  

{'Date': 'Sun, 09 Jul 2017 23:44:55 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Set-Cookie': '__cfduid=d071cba66cc515ca7f2bc620362c6d46d1499643895; expires=Mon, 09-Jul-18 23:44:55 GMT; path=/; domain=.dbr.ee; HttpOnly, ahoy_visitor=64d9f580-781e-4037-8951-ce57b73df720; path=/; expires=Tue, 09 Jul 2019 23:44:55 -0000, ahoy_visit=802132cc-4e0e-4089-9be5-49f05223f567; path=/; expires=Mon, 10 Jul 2017 03:44:55 -0000, SERVERID=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/', 'Status': '302 Found', 'Cache-Control': 'no-cache', 'X-XSS-Protection': '1; mode=block', 'X-Request-Id': '14a0d0df-c14d-477d-b87c-b6edb823619c', 'Location': 'https://s.dbr.ee/sffc/python%2Dlogo%2Dmaster%2Dv3%2DTM.png.zip?temp_url_sig=084b2b71c8c12df993d528e991a5b44e46e974ef&temp_url_expires=1499644195&filename=python-logo-master-v3-TM.png.zip;&attachment', 'X-Runtime': '0.006968', 'X-Frame-Options': 'SAMEORIGIN', 'X-Content-Type-Options': 'nosniff', 'Server': 'cloudflare-nginx', 'CF-RAY': '37bf2769fda80fa5-YYZ'}
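The Location value carries `temp_url_sig` and `temp_url_expires` query parameters, which suggests the link is only valid for a limited time. As a rough sketch, assuming `temp_url_expires` is a Unix timestamp in seconds (the response does not state this), the expiry can be read with the standard library. The helper name `temp_url_expiry` is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def temp_url_expiry(location):
    """Return the temp_url_expires value of a Location URL as a UTC datetime."""
    query = parse_qs(urlsplit(location).query)
    # Assumption: the value is a Unix timestamp in seconds
    return datetime.fromtimestamp(int(query['temp_url_expires'][0]), tz=timezone.utc)


# Location value copied from the response headers above
location = ('https://s.dbr.ee/sffc/python%2Dlogo%2Dmaster%2Dv3%2DTM.png.zip'
            '?temp_url_sig=084b2b71c8c12df993d528e991a5b44e46e974ef'
            '&temp_url_expires=1499644195'
            '&filename=python-logo-master-v3-TM.png.zip;&attachment')

expiry = temp_url_expiry(location)
print(expiry.isoformat())  # 2017-07-09T23:49:55+00:00
```

For the response above this is five minutes after its Date header, which is consistent with the assumption, so the file should be downloaded soon after the redirect is read.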