Why can't curl get the same response as the browser?

Time: 2017-04-26 09:56:09

Tags: internet-explorer curl

I have an API that behaves like this:

(screenshot: result in the IE browser)

I want to get the same result with curl on my Ubuntu machine, so I tried this:

curl -v \
--header 'Accept: text/html, application/xhtml+xml, */*' \
--header 'Accept-Language: zh-CN' \
-A 'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko' \
--header 'Accept-Encoding: gzip, deflate' \
--header 'Host: 10.202.15.197:20176' \
--header 'DNT: 1' \
--header 'Connection: Keep-Alive' \
http://10.202.15.197:20176?user_id=1&query_type=GEOSPLIT&address=广东省深圳市宝安&ret_splitinfo=1

(screenshot: result with curl)

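Then something strange happened: as you can see, I got a completely different result from the one in the browser. I assumed it must be an encoding problem, so I tried this: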

curl -v \
--header 'Accept: text/html, application/xhtml+xml, */*' \
--header 'Accept-Language: zh-CN' \
-A 'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko' \
--header 'Accept-Encoding: gzip, deflate' \
--header 'Host: 10.202.15.197:20176' \
--header 'DNT: 1' \
--header 'Connection: Keep-Alive' \
http://10.202.15.197:20176 --data-urlencode 'user_id=1&query_type=GEOSPLIT&address=广东省深圳市宝安&ret_splitinfo=1'

(screenshot: I get a totally different result)

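But no, it returns the same wrong result. I then captured the request from my Windows browser with Fiddler, and this is the request and response data I got: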

GET http://10.202.15.197:20176/?user_id=1&query_type=GEOSPLIT&address=广东省深圳市宝安&ret_splitinfo=1 HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: zh-CN
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Host: 10.202.15.197:20176
DNT: 1
Connection: Keep-Alive
Pragma: no-cache


HTTP/1.0 200 OK
Content-Type: application/octet-stream
Connection: close
Content-Length: 222

<?xml version='1.0' encoding='GBK'?>
<addrSplitInfo>
<status>0</status><as_info prop="1" level="1">广东省</as_info>
<as_info prop="1" level="2">深圳市</as_info>
<as_info prop="3" level="18">宝安</as_info>
</addrSplitInfo>

1 answer:

Answer 0 (score: 0)

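I looked into this further (see here). The cause is that the IE browser uses GBK as its default encoding, whereas Chrome, Firefox, cURL, and python-requests all use UTF-8. The server therefore receives different bytes for the address parameter depending on the client.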
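
To make the difference concrete, here is a minimal sketch that shows the bytes each encoding produces for the address value, printed in percent-encoded form so they are readable. The helper name percent_encode is hypothetical and only for illustration; it is not part of the API in the question.

```python
from urllib.parse import quote


def percent_encode(text: str, charset: str) -> str:
    """Convert text to bytes in the given charset, then percent-encode every byte."""
    return quote(text, safe="", encoding=charset)


address = "广东省深圳市宝安"  # the address value from the question

# Bytes a UTF-8 client would produce, shown percent-encoded
utf8_form = percent_encode(address, "utf-8")
# Bytes a GBK client would produce, shown percent-encoded
gbk_form = percent_encode(address, "gbk")

print("UTF-8:", utf8_form)
print("GBK:  ", gbk_form)
print("identical?", utf8_form == gbk_form)
```

The two strings differ, so a server that decodes the query as GBK cannot recognise an address sent as UTF-8.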

Here are solutions for calling this API:

cURL

echo "http://10.202.15.197:20176\?user_id\=1\&query_type\=GEOSPLIT\&address\=广东省深圳市宝安\&ret_splitinfo\=1" | iconv -f utf-8 -t gbk | xargs curl
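
As an alternative to piping raw GBK bytes through iconv, the URL could be built with the query percent-encoded as GBK, which keeps it pure ASCII. This is only a sketch: it assumes the server decodes percent-escapes before interpreting the bytes as GBK, which the captured traffic does not confirm. The function name build_gbk_url is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "http://10.202.15.197:20176/"  # endpoint from the question


def build_gbk_url(base_url: str, params: dict) -> str:
    """Append params as a query string whose non-ASCII values are GBK percent-encoded."""
    return base_url + "?" + urlencode(params, encoding="gbk")


params = {
    "user_id": 1,
    "query_type": "GEOSPLIT",
    "address": "广东省深圳市宝安",
    "ret_splitinfo": 1,
}

url = build_gbk_url(BASE_URL, params)
print(url)
```

The printed URL can then be passed to curl inside single quotes, so the shell does not interpret the & characters.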

python-requests

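import requests
# Encode the address as GBK bytes so requests percent-encodes those bytes instead of UTF-8
payload = {"user_id": 1, "query_type": "GEOSPLIT", "address": "广东省深圳市宝安".encode("gbk"), "ret_splitinfo": 1}
r = requests.get("http://10.202.15.197:20176", params=payload)
r.encoding = "gbk"  # the response body declares encoding='GBK'
print(r.text)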