使用Python中的请求进行表单提交/ POST

时间:2014-12-02 02:23:03

标签: python forms python-requests

我正在尝试使用Python中的请求模块向网站提交表单,但我的表单未正确提交。我可以在网站上手动正确提交表单,但我认为我的代码有问题导致Python提交失败。我可以使用Python成功登录网站并访问页面/发出GET请求。我可以向我提交表单的页面发出GET请求,它将成功加载,即请求登录工作。我已经包含了下面的所有输出,包括Python代码,Python中的无效表单提交以及浏览器提交的有效表单。这可能是矫枉过正,但我​​对此缺乏经验,我不确定有什么必要。

我的登录代码是:

    s = requests.Session()

    data = s.get(login_url)
    authToken = re.search(('name="authenticity_token"[\s]'
                       'type="hidden"[\s]+value="(.+)"'), \
                      data.text).group(1)
    data_dict = {
    'utf8': '✓',
    'authenticity_token': authToken,
    'admin[email]': username,
    'admin[password]': password,
    'admin[remember_me]': '1',
    'commit': 'Sign in'
    }

    s.post(login_url, data_dict)

这成功登录我,我可以向任何页面提交GET请求并获得有效结果。

我提交表单的代码:

    payload = {
    'utf8': '✓',
    'authenticity_token': authToken,
    'progress_course[name]': name,
    'progress_course[description]': desc,
    'commit': 'Create Course'
    }


    headers = {
            'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
            'Accept-Encoding':'gzip, deflate',
            'Accept-Language':'en-US,en;q=0.8',
            'Cache-Control':'max-age=0',
            'Connection':'keep-alive',
            'Content-Length':'176',
            'Content-Type':'application/x-www-form-urlencoded',
            'Host':'xxx.com',
            'Origin':'https://xxx.com',
            'Referer':'https://xxx.com/workshop/progress/courses/new',
            'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36'
    }

    response = s.post(path,data=payload,headers=headers)

我的表单提交无效。这是python日志记录模块输出:

POST:

send: 'POST /workshop/progress/courses HTTP/1.1
Origin: https://xxx.com\r\nContent-Length: 181
Accept-Language: en-US,en;q=0.8
Accept-Encoding: gzip, deflate
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36
Host: xxx.com
Referer: https://xxx.com/workshop/progress/courses/new
Cache-Control: max-age=0
Cookie: _brainfit_session=xxx; remember_admin_token=xxx
Content-Type: application/x-www-form-urlencoded
progress_course%5Bdescription%5D=pytest&utf8=%26%23x2713%3B&commit=Create+Course&progress_course%5Bname%5D=pytest&authenticity_token=xxx

回复POST。您可以看到这来自登录页面,而不是生成新页面,如右下方正确的输出所示。

reply: 'HTTP/1.1 302 Found\r\n'
header: Server: nginx
header: Date: Tue, 02 Dec 2014 01:57:46 GMT
header: Content-Type: text/html; charset=utf-8
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Status: 302 Found
header: Strict-Transport-Security: max-age=31536000
header: X-Frame-Options: SAMEORIGIN
header: X-XSS-Protection: 1; mode=block
header: X-Content-Type-Options: nosniff
header: Location: https://xxx.com/admins/sign_in
header: Cache-Control: no-cache
header: X-Request-Id: 983472d4-8954-4106-904a-38ea3b6a76a1
header: X-Runtime: 0.039963
DEBUG:requests.packages.urllib3.connectionpool:"POST /workshop/progress/courses HTTP/1.1" 302 None

重定向GET和回复。我可能会误解这些实际上是什么:

send: 'GET /admins/sign_in HTTP/1.1
Origin: https://xxx.com
Accept-Language: en-US,en;q=0.8
Accept-Encoding: gzip, deflate
Host: xxx.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36
Connection: keep-alive
Referer: https://xxx.com/workshop/progress/courses/new
Cache-Control: max-age=0
Cookie: _brainfit_session=xxx; remember_admin_token=xxx
Content-Type: application/x-www-form-urlencoded


reply: 'HTTP/1.1 302 Found\r\n'
header: Server: nginx
header: Date: Tue, 02 Dec 2014 01:57:46 GMT
header: Content-Type: text/html; charset=utf-8
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Status: 302 Found
header: Strict-Transport-Security: max-age=31536000
header: X-Frame-Options: SAMEORIGIN
header: X-XSS-Protection: 1; mode=block
header: X-Content-Type-Options: nosniff
header: Location: https://xxx.com/workshop/progress/courses
header: Cache-Control: no-cache
header: Set-Cookie: _brainfit_session=xxx; path=/; secure; HttpOnly
header: X-Request-Id: 9e5535d6-143c-4e98-8bb4-35dd29ab045d
header: X-Runtime: 0.007505
DEBUG:requests.packages.urllib3.connectionpool:"GET /admins/sign_in HTTP/1.1" 302 None


send: 'GET /workshop/progress/courses HTTP/1.1
Origin: https://xxx.com
Accept-Language: en-US,en;q=0.8
Accept-Encoding: gzip, deflate
Host: xxx.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36
Connection: keep-alive
Referer: https://xxx.com/workshop/progress/courses/new
Cache-Control: max-age=0
Cookie: _brainfit_session=exxx; remember_admin_token=xxx
Content-Type: application/x-www-form-urlencoded

reply: 'HTTP/1.1 200 OK\r\n'
header: Server: nginx
header: Date: Tue, 02 Dec 2014 01:57:46 GMT
header: Content-Type: text/html; charset=utf-8
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: Status: 200 OK
header: Strict-Transport-Security: max-age=31536000
header: X-Frame-Options: SAMEORIGIN
header: X-XSS-Protection: 1; mode=block
header: X-Content-Type-Options: nosniff
header: Cache-Control: max-age=0, private, must-revalidate
header: Set-Cookie: _brainfit_session=xxx; path=/; secure; HttpOnly
header: X-Request-Id: 4e759b60-af1c-4f39-b033-71ff99b62df4
header: X-Runtime: 0.019001
header: Content-Encoding: gzip
DEBUG:requests.packages.urllib3.connectionpool:"GET /workshop/progress/courses HTTP/1.1" 200 None

网站上正确提交的表单的POST标题:

Remote Address:xxx
Request URL:https://xxx.com/workshop/progress/courses
Request Method:POST
Status Code:302 Found

请求标题

Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:176
Content-Type:application/x-www-form-urlencoded
Cookie:__utma=xxx; __utmc=xxx; __utmz=xxx.utmcsr=google|utmccn=(organic)|utmcmd=organic|
utmctr=xxx; km_lv=x; kvcd=xxx; km_ai=xxx; km_ni=xxx; km_uq=xxx; has_logged_in=true; WT_FPC=id=xxx; _ga=xxx; _brainfit_session=xxx
Host:xxx.com
Origin:https://xxx.com
Referer:https://xxx.com/workshop/progress/courses
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36

表单数据

utf8:✓
authenticity_token:xxx
progress_course[name]:test03
progress_course[description]:test03
commit:Create Course

响应标题

Cache-Control:no-cache
Connection:keep-alive
Content-Type:text/html; charset=utf-8
Date:Mon, 01 Dec 2014 14:01:37 GMT
Location:https://xxx.com/workshop/progress/courses/44
Server:nginx
Set-Cookie:_brainfit_session=xxx; path=/; secure; HttpOnly
Status:302 Found
Strict-Transport-Security:max-age=31536000
Transfer-Encoding:chunked
X-Content-Type-Options:nosniff
X-Frame-Options:SAMEORIGIN
X-Request-Id:8bb04242-bc5a-408f-99a5-9d8bf4eeb611
X-Runtime:0.028559
X-XSS-Protection:1; mode=block

正确提交表单后,还会从其重定向到的页面获得GET响应。我已在此处排除了其他输出,但如果它有助于解决问题,则可以添加它。

Remote Address:xxx
Request URL:https://xxx.com/workshop/progress/courses/44
Request Method:GET
Status Code:200 OK

0 个答案:

没有答案