Question

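I'm trying to automate passing text from a website to a tool that estimates the text's reading level. However, when I send the URL-encoded text with a POST request, I get a 400 Bad Request error.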

import urllib.parse
import requests

article = 'The quick brown fox jumps over the lazy dog.'
headers = {
    'Host': 'auto-ilr.ll.mit.edu',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate',
    'DNT': '1',
    'Referer': 'https://auto-ilr.ll.mit.edu/instant/',
    'Connection': 'keep-alive',
}
s = requests.Session()
#s.mount('https://', SSLAdapter())
s.mount('https://', MyAdapter())  # MyAdapter is my own transport adapter (definition not shown)
try:
    # The form fields are URL-encoded by hand into a single string before posting
    postdata = urllib.parse.urlencode({'Language': 'English', 'Text': article})
    soup = s.post('https://auto-ilr.ll.mit.edu/instant/summary3', data=postdata, headers=headers, verify=False)
except requests.RequestException as exc:
    # The original handler was cut off in the post; this is only a placeholder
    print(exc)

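I'm not sure what the difference is, but in some cases the request does complete and the soup variable ends up holding the page text. Even then, the text shows that the site did not correctly process the text I submitted.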

Answer 1

You're missing something simple: you don't have to encode data yourself. Pass a dict and requests does the encoding for you:

import requests

article = 'The quick brown fox jumps over the lazy dog.'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0',
    'Referer': 'https://auto-ilr.ll.mit.edu/instant/'
}
# Plain dict: requests form-encodes it when the request is sent
postdata = {'Language': 'English', 'Text': article}
s = requests.Session()
soup = s.post('https://auto-ilr.ll.mit.edu/instant/summary3', data=postdata, headers=headers, verify=False)

# soup is a requests.Response here, so status_code is the HTTP status of the POST
print(soup.status_code)

Also, you don't have to send all of those headers. Sometimes just 'User-Agent' or 'Referer' is enough.
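
If you want to see what actually changes between the two versions, here is a minimal offline sketch. It only builds the two requests with requests.Request(...).prepare() and never sends them. The names fields, from_dict and from_str are mine, not from the question. My assumption, which I can't confirm for this particular server, is that the 400 comes from the missing Content-Type header rather than from the body itself.

```python
import urllib.parse

import requests

article = 'The quick brown fox jumps over the lazy dog.'
fields = {'Language': 'English', 'Text': article}
url = 'https://auto-ilr.ll.mit.edu/instant/summary3'

# Build (but do not send) the request both ways so they can be compared.
from_dict = requests.Request('POST', url, data=fields).prepare()
from_str = requests.Request(
    'POST', url, data=urllib.parse.urlencode(fields)
).prepare()

# With a dict, requests form-encodes the body and labels it as form data.
print(from_dict.headers.get('Content-Type'))

# With a pre-encoded string, requests sends the body as-is and adds no
# Content-Type, so the server has to guess how to parse it.
print(from_str.headers.get('Content-Type'))

# For this input the encoded bodies themselves are identical.
print(from_dict.body == from_str.body)
```

So if you really want to keep the manual urlencode call, adding 'Content-Type': 'application/x-www-form-urlencoded' to your headers should make the two requests equivalent, but passing the dict is simpler.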
