Python: How can I loop over POST requests without changing the User-Agent?

Time: 2019-06-09 14:26:50

Tags: python beautifulsoup python-requests

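I am writing a script that logs in to multiple forums of the same type (phpBB). When I loop over the list of URLs, most of them return the response I want only with particular user agents, and the user agent that works differs from URL to URL.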

The remaining URLs usually return a 200 status code, but the response body contains the following:

<div class="error">The submitted form was invalid. Try submitting again.</div> <dl>
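Because the status code is 200 even when the login is rejected, the response body has to be inspected to tell the two cases apart. The following is a minimal sketch of such a check, assuming the rejection always appears inside a `div` with class `error` as in the snippet above. The names `ErrorDivFinder` and `find_form_errors` are hypothetical, and the sketch uses only the standard library; the same test could be written against a BeautifulSoup object.

```python
from html.parser import HTMLParser


class ErrorDivFinder(HTMLParser):
    """Collects the text of every <div class="error"> in a page."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # div nesting depth inside an error div (0 = outside)
        self.errors = []  # text of each error div found so far

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            # A div nested inside an error div: track it so the matching
            # </div> does not end the error block early.
            self._depth += 1
        elif "error" in classes:
            self._depth = 1
            self.errors.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.errors[-1] += data


def find_form_errors(html):
    """Return the stripped text of each error div, or [] if there is none."""
    finder = ErrorDivFinder()
    finder.feed(html)
    finder.close()
    return [text.strip() for text in finder.errors]
```

Calling `find_form_errors(response.text)` after the POST would then return an empty list for a page without an error block, and the error message otherwise.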

I have tested the user agents shown in the comments below:

import requests
from bs4 import BeautifulSoup


def auth_check():
    errorlist = []
    windowsnt_usera = []
    with open('other_phpBB_still_errors5.txt', "r") as f:
        for item in f:
            try:
                item2 = item.strip()
                #headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A'}
                #headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0'}
                #headers = {'User-Agent': 'Mozilla Firefox Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:53.0) Gecko/20100101 Firefox/53.0'}
                #headers = {'User-Agent': 'Mozilla/5.0 (Linux; Android 7.0; BLL-L22 Build/HUAWEIBLL-L22)'
                 #                         ' AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.91 Mobile Safari/537.36'}
                #headers = {'User-Agent': 'Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148'}
                headers = {'User-Agent': 'Opera/9.80 (Windows NT 6.1; WOW64) Presto/2.12.388 Version/12.18'}
                s = requests.Session()
                r = s.get(item2, headers=headers)
                soup = BeautifulSoup(r.text, "html.parser")
                creation = soup.find('input', {'name': 'creation_time'})['value']
                token = soup.find('input', {'name': 'form_token'})['value']

                payload = {'username': 'georgejetson', 'password': 'Msafsdfwe23', 'login': 'Login',
                           'redirect': './index.php?', 'creation_time': creation, 'form_token': token}

                print(token, creation)
                print(s.cookies)
                #print(r.status_code, soup)
                response = s.post(item2, headers=headers, data=payload)
                soup2 = BeautifulSoup(response.text, "html.parser")
                print(response.status_code)
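                link = soup2.find('a', href=True, text='Board Administrator')
                print(link)
            except Exception as exc:
                # Assumed handler (the snippet ends before the except clause):
                # record the failing URL and move on to the next one.
                errorlist.append((item.strip(), exc))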

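I have made lists of which URLs work with which user agents, and I plan to loop over each list separately. I am trying to make this as efficient as possible.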
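One possible alternative to looping over each list separately is to keep a single mapping from URL to user agent and look up the header inside one loop. This is only a sketch under assumptions: the URLs below are hypothetical placeholders, and `DEFAULT_USER_AGENT`, `USER_AGENT_BY_URL`, and `headers_for` are illustrative names that do not appear in the script above. The user-agent strings are taken from the script.

```python
# Fallback user agent for URLs that have no specific entry (assumed choice).
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"
)

# Hypothetical placeholder URLs: replace them with the real forum login URLs
# and the user agent that was observed to work for each one.
USER_AGENT_BY_URL = {
    "https://forum-a.example/ucp.php?mode=login":
        "Opera/9.80 (Windows NT 6.1; WOW64) Presto/2.12.388 Version/12.18",
    "https://forum-b.example/ucp.php?mode=login":
        "Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) "
        "AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148",
}


def headers_for(url, overrides=None, default=DEFAULT_USER_AGENT):
    """Build the request headers for one URL from the mapping."""
    if overrides is None:
        overrides = USER_AGENT_BY_URL
    return {"User-Agent": overrides.get(url, default)}
```

Inside the loop in `auth_check`, `headers = headers_for(item2)` would then take the place of the hard-coded `headers` line, so a single pass over the URL file covers every forum.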

Is there a better way to loop over the POST requests without changing the user agent?

0 answers:

No answers