Recreating a POST request with a WebKitFormBoundary using Python requests

Date: 2018-07-15 14:27:52

Tags: python python-requests

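I am trying to scrape some data from a website using a POST request with the Python requests library. Unfortunately, I can't post a link to the page, because you have to be logged in to the site to use it.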

The request I want to replicate targets a page with the .ehtml file extension, and this is part of the request payload I want to recreate:

------WebKitFormBoundary8rntuVzldIBHkILv
Content-Disposition: form-data; name="session_id"

W0pNKn8AAQEAACD-XkYAAAAJ
------WebKitFormBoundary8rntuVzldIBHkILv
Content-Disposition: form-data; name="p_session_id"

W0pMOH8AAQEAABZSUVkAAAAD
------WebKitFormBoundary8rntuVzldIBHkILv
Content-Disposition: form-data; name="attach_key"


------WebKitFormBoundary8rntuVzldIBHkILv
Content-Disposition: form-data; name="chosen"

0
------WebKitFormBoundary8rntuVzldIBHkILv
Content-Disposition: form-data; name="debug"


------WebKitFormBoundary8rntuVzldIBHkILv
Content-Disposition: form-data; name="language"

en
------WebKitFormBoundary8rntuVzldIBHkILv
Content-Disposition: form-data; name="game_system_id"

NULL
------WebKitFormBoundary8rntuVzldIBHkILv
Content-Disposition: form-data; name="collection_detail_id"

NULL
------WebKitFormBoundary8rntuVzldIBHkILv
Content-Disposition: form-data; name="competition_id"

NULL

With the help of a few Stack Overflow questions, I have managed to recreate this much so far:

--30b11983bde849109a3dc93e139e16d4
Content-Disposition: form-data; name="session_id"


--30b11983bde849109a3dc93e139e16d4
Content-Disposition: form-data; name="p_session_id"


--30b11983bde849109a3dc93e139e16d4
Content-Disposition: form-data; name="attach_key"


--30b11983bde849109a3dc93e139e16d4
Content-Disposition: form-data; name="chosen"

0
--30b11983bde849109a3dc93e139e16d4
Content-Disposition: form-data; name="debug"


--30b11983bde849109a3dc93e139e16d4
Content-Disposition: form-data; name="language"

en
--30b11983bde849109a3dc93e139e16d4
Content-Disposition: form-data; name="game_system_id"

NULL
--30b11983bde849109a3dc93e139e16d4
Content-Disposition: form-data; name="collection_detail_id"

NULL
--30b11983bde849109a3dc93e139e16d4
Content-Disposition: form-data; name="competition_id"

NULL

This was done with the following code:

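import requests
from requests import Request

# login_URL2, req_url and payload are defined elsewhere
# (login URL, URL of the .ehtml page, and the login form data).

# A (None, value) tuple makes requests send the entry as a plain form field
# (no filename) inside a multipart/form-data body.
Q = {
    "session_id": (None, ""),
    "p_session_id": (None, ""),
    "attach_key": (None, ""),
    "chosen": (None, "0"),
    "debug": (None, ""),
    "language": (None, "en"),
    "game_system_id": (None, "NULL"),
    "collection_detail_id": (None, "NULL"),
    "competition_id": (None, "NULL"),
}

with requests.Session() as s:
    p = s.post(login_URL2, data=payload)
    # print(p.text)

    # d = s.post(req_url, files=Q)

# Build the request without sending it, so the encoded body can be inspected.
d2 = Request("POST", req_url, files=Q)
d3 = d2.prepare()
print(d3.body.decode('utf-8'))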

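I believe the last thing I'm missing is the WebKitFormBoundary part, and I can't find anywhere how to insert it. This is my first time scraping a page with an .ehtml extension, so if I have missed anything else obvious, any pointers would be appreciated.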

5 answers:

Answer 0 (score: 3)

import requests
import random,string
from requests_toolbelt import MultipartEncoder

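fields = {
    # your_data is a placeholder for the file contents (bytes or an open file object)
    'file': ('test.png', your_data, "image/png"),
    'file_id': "0"
}
# Mimic a browser-style boundary: the prefix plus 16 random alphanumeric characters
boundary = '----WebKitFormBoundary' \
           + ''.join(random.sample(string.ascii_letters + string.digits, 16))
m = MultipartEncoder(fields=fields, boundary=boundary)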

headers = {
    "Host": "xxxx",
    "Connection": "keep-alive",
    "Content-Type": m.content_type
}

req = requests.post('https://xxxx/api/upload', headers=headers, data=m)
print(req.text)

This way we can produce a boundary in the same format as ------WebKitFormBoundary8rntuVzldIBHkILv.
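If requests_toolbelt is not available, the same body shape can be produced by hand. The following is only a minimal sketch using the standard library: the helper names webkit_style_boundary and encode_form_fields are made up for illustration, and it handles plain text fields only (no file parts, no escaping of field names).

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def webkit_style_boundary(length: int = 16) -> str:
    """Return a boundary shaped like the ones seen in the browser payload."""
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return "----WebKitFormBoundary" + suffix


def encode_form_fields(fields: dict, boundary: str) -> tuple:
    """Encode plain text fields as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")  # each part starts with "--" + boundary
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")               # blank line separates headers from value
        lines.append(value)
    lines.append(f"--{boundary}--")    # closing delimiter has a trailing "--"
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type


boundary = webkit_style_boundary()
body, content_type = encode_form_fields({"chosen": "0", "language": "en"}, boundary)
print(content_type)
print(body.decode("utf-8"))
```

The resulting body and Content-Type value could then be passed to requests.post through data= and headers=, as in the answer above.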

Answer 1 (score: 1)

The exact name of the boundary doesn't matter, as long as the boundary is declared in the header:

Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p

With this header, the boundary delimiter in the body will be

--gc0p4Jq0M2Yt08jU534c0p

The server looks at the Content-Type header and uses it to find the individual body parts.
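To illustrate what the receiving side does, here is a rough sketch that reads the boundary from the header and splits a body on it. The function names boundary_delimiter and split_parts are hypothetical, and the naive split assumes the delimiter never occurs inside a field value; a real server would use a proper multipart parser.

```python
from email.message import Message


def boundary_delimiter(content_type: str) -> bytes:
    """Return the body delimiter ("--" + boundary) declared in a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    boundary = msg.get_param("boundary")
    if boundary is None:
        raise ValueError("Content-Type header declares no boundary")
    return b"--" + boundary.encode("ascii")


def split_parts(content_type: str, body: bytes) -> list:
    """Split a multipart body into its raw parts (headers plus value)."""
    delimiter = boundary_delimiter(content_type)
    chunks = body.split(delimiter)
    # chunks[0] is the (empty) preamble; the last chunk is the "--" closing marker.
    return [chunk.strip(b"\r\n") for chunk in chunks[1:-1]]


header = "multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p"
sample_body = (
    b"--gc0p4Jq0M2Yt08jU534c0p\r\n"
    b'Content-Disposition: form-data; name="chosen"\r\n\r\n'
    b"0\r\n"
    b"--gc0p4Jq0M2Yt08jU534c0p\r\n"
    b'Content-Disposition: form-data; name="language"\r\n\r\n'
    b"en\r\n"
    b"--gc0p4Jq0M2Yt08jU534c0p--\r\n"
)
parts = split_parts(header, sample_body)
for part in parts:
    print(part.decode("utf-8"))
    print("---")
```

Because only the header-declared value matters, a boundary generated by requests works just as well as a WebKit-style one.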

Answer 2 (score: 1)

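When you send an ajax request via jQuery and want to send FormData, you don't need to call JSON.stringify on the FormData. Also, when you send a file, the content type must be multipart/form-data including the boundary, something like multipart/form-data; boundary=----WebKitFormBoundary0BPm0koKA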

So to send FormData including some file via jQuery ajax you need to:

Set data to the FormData without any modifications.
Set processData to false (Lets you prevent jQuery from automatically transforming the data into a query string).
Set the contentType to false (This is needed because otherwise jQuery will set it incorrectly).
Your request should look like this:

var formData = new FormData();

formData.append('name', dogName);
// ... 
formData.append('file', document.getElementById("dogImg").files[0]);


$.ajax({
    type: "POST",
    url: "/foodoo/index.php?method=insertNewDog",
    data: formData,
    processData: false,
    contentType: false,
    success: function(response) {
        console.log(response);
    },
    error: function(errResponse) {
        console.log(errResponse);
    }
});

Answer 3 (score: 0)

------WebKitFormBoundary89uZMBZwSHfYjySK
Content-Disposition: form-data; name="account_number"

blah
------WebKitFormBoundary89uZMBZwSHfYjySK
Content-Disposition: form-data; name="date_of_birth"

blah
------WebKitFormBoundary89uZMBZwSHfYjySK
Content-Disposition: form-data; name="first_name"

blah
------WebKitFormBoundary89uZMBZwSHfYjySK
Content-Disposition: form-data; name="last_name"

blah
------WebKitFormBoundary89uZMBZwSHfYjySK--

I basically converted these WebKit form boundary fields into a JSON-like dictionary, as follows:

import requests

data = {
    "account_number": "blah",
    "date_of_birth": "blah",
    "first_name": "blah",
    "last_name": "blah"
}

headers = {
    "Authorization": "Bearer blah"
}

req = requests.post('https://rest.blah/v1/blah/sign-in', headers=headers, data=data)
print(req.content)

Response:

b'{"code":200,"data":{"user_id":"15442","building_id":"11","apartment_id":"4192"}}'
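Note that a dictionary passed through data= is sent by requests as a URL-encoded form (application/x-www-form-urlencoded), not as JSON and not as multipart, so this approach only works when the server also accepts that encoding. The sketch below uses the standard library's urlencode as a stand-in for what requests sends for simple string values (an assumption for illustration) and contrasts it with a JSON body.

```python
import json
from urllib.parse import urlencode

data = {
    "account_number": "blah",
    "date_of_birth": "blah",
    "first_name": "blah",
    "last_name": "blah",
}

# Roughly the body sent when a dict is passed via data=...
form_body = urlencode(data)
print(form_body)

# ...versus the body a JSON request would carry (requests' json= parameter).
json_body = json.dumps(data)
print(json_body)
```

If the server insists on multipart/form-data, the fields have to go through files= (or a multipart encoder) instead.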

Answer 4 (score: 0)

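Setting the Content-Type header manually means it is missing the boundary parameter. Remove that header and let fetch generate the full content type. It will look something like this: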

Content-Type: multipart/form-data;boundary=----WebKitFormBoundaryyrV7KO0BoCBuDbTL

Fetch knows which Content-Type header to create based on the FormData object passed in as the request body.
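The same pitfall applies to Python requests, which is what the question uses. The sketch below prepares two requests without sending them; the URL is a placeholder that is never contacted, and the field values are taken from the question's payload.

```python
import requests

URL = "https://example.invalid/upload"  # placeholder, never contacted
FIELDS = {"chosen": (None, "0"), "language": (None, "en")}

# Let requests build the header: it includes the boundary it generated.
auto = requests.Request("POST", URL, files=FIELDS).prepare()
print(auto.headers["Content-Type"])

# Set the header by hand: requests keeps our value, which has no boundary,
# so the server cannot split the body into parts.
manual = requests.Request(
    "POST",
    URL,
    files=FIELDS,
    headers={"Content-Type": "multipart/form-data"},
).prepare()
print(manual.headers["Content-Type"])
```

In other words, when posting with files=, it is usually safest to leave Content-Type out of the custom headers.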