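Hi, I want to automate downloading a file from a website that uses a form login. When using a browser, I can see a cookie in the request HTTP headers. It seems to be required for successful authorization; otherwise I get a 401 error. Even sending the request twice does not work, because the first response does not contain the required cookie. Any suggestions on how to obtain the cookie from the request HTTP headers using Python would be welcome.
URL to log in: https://services.geoplace.co.uk/login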
import mechanize
import cookielib
from bs4 import BeautifulSoup as bs
import html2text
import html5lib
import sys
# Browser
br = mechanize.Browser()
# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
br.addheaders = [('User-agent', 'Chrome')]
# The site we will navigate into, handling its session
br.open('https://login.geoplace.co.uk/login')
# View available forms
for f in br.forms():
    print "Form " + str(f)
# Select the login form by index (nr=0 is the first form on the page)
br.select_form(nr=0)
# User credentials
br.form['username'] = 'myusername'
br.form['password'] = 'mypassword'
# Login
response = br.submit()
br.open('https://services.geoplace.co.uk')
request = br.request
print request.header_items()
# if successful we have some cookies now
cookies = br._ua_handlers['_cookies'].cookiejar
# convert cookies into a dict usable by requests
cookie_dict = {}
for c in cookies:
    cookie_dict[c.name] = c.value
print cookie_dict
br.open('https://services.geoplace.co.uk/api/downloadMatrix/getFile?fileName=30001_81s3.zip&fileType=LEVEL_3&fileVersion=May-2020&sfAccountId=xxx')
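Answer 0 (score: 0)
The API you mention above supports the OAuth2 (client, password) grant types. If you ask GeoPlace for help (by email at support@geoplace.co.uk), we will create client credentials for you on request and you should then be able to access it (other organisations already use our services this way).
Once you have the credentials, the steps are as follows: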
curl 'https://login.geoplace.co.uk/oauth/token' -H "Authorization: Basic ZZZZZZZZZZZZZZZZZZZ" -d username='xxxxxxx' -d password='yyyyyyy' -d grant_type=password
(This returns your token information.)
Using the token above, run curl -H "Authorization: Bearer 66666-yyy-Ysdf-bb-xxxxx" -o 'FILE_NAME_TO_SAVE_IN_LOCAL.zip' 'https://services.geoplace.co.uk/api/downloadMatrix/getFile?fileName=30001_81s3.zip&fileType=LEVEL_3&fileVersion=May-2020&sfAccountId=xxx'
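Since the question asked for Python, the two curl steps above could be sketched roughly as follows, using only the Python 3 standard library. This is a sketch under assumptions, not confirmed GeoPlace behaviour. It assumes the Basic value is the usual base64 of "client_id:client_secret", and that the token response is JSON with an "access_token" field (the standard OAuth2 name). The function names and the client_id / client_secret parameters are hypothetical.

```python
import base64
import json
import shutil
import urllib.parse
import urllib.request

TOKEN_URL = "https://login.geoplace.co.uk/oauth/token"
DOWNLOAD_URL = "https://services.geoplace.co.uk/api/downloadMatrix/getFile"


def build_token_request(client_id, client_secret, username, password):
    """Build the POST request for the OAuth2 password grant (curl step 1)."""
    # Assumption: the Basic value is base64("client_id:client_secret").
    basic = base64.b64encode(
        "{}:{}".format(client_id, client_secret).encode()
    ).decode()
    body = urllib.parse.urlencode(
        {"username": username, "password": password, "grant_type": "password"}
    ).encode()
    # Supplying data makes urllib send a form-encoded POST, like curl -d.
    return urllib.request.Request(
        TOKEN_URL, data=body, headers={"Authorization": "Basic " + basic}
    )


def build_download_request(access_token, params):
    """Build the GET request for the file download (curl step 2)."""
    url = DOWNLOAD_URL + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + access_token}
    )


def download_file(client_id, client_secret, username, password, params, dest):
    """Fetch a token, then stream the requested file to dest."""
    token_request = build_token_request(
        client_id, client_secret, username, password
    )
    with urllib.request.urlopen(token_request) as resp:
        # Assumption: the response body is JSON containing "access_token".
        token = json.load(resp)["access_token"]
    download_request = build_download_request(token, params)
    with urllib.request.urlopen(download_request) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
```

A call would then look like download_file(client_id, client_secret, 'xxxxxxx', 'yyyyyyy', {'fileName': '30001_81s3.zip', 'fileType': 'LEVEL_3', 'fileVersion': 'May-2020', 'sfAccountId': 'xxx'}, 'FILE_NAME_TO_SAVE_IN_LOCAL.zip'). No cookie handling is involved, because the bearer token replaces the session cookie from the browser login.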