curl命令中文/日文url编码

时间:2013-04-10 06:05:06

标签: python linux bash curl

我想向网站发送帖子请求。 下面的python代码工作正常。

# -*- encoding=utf-8 -*-
import urllib, urllib2

url = "http://xxx.com/login.asp"
req = urllib2.urlopen(url, urllib.urlencode({"name":u"汉字".encode('GB2312'),"id":u"12345"}))
print req.read()

但是bash代码返回错误的页面。是编码问题还是其他问题?

url="http://xxx.com/login.asp"
curl --data-urlencode "name=汉字&id=12345" $url

1 个答案:

答案 0 :(得分:1)

In [4]: urllib.urlencode({"name":"汉字".decode('utf-8').encode('GB2312'),"id":u"12345"})
Out[4]: 'name=%BA%BA%D7%D6&id=12345'

根据卷曲手册页,

--data-urlencode <data>
      ...
      The <data> part can be passed to
      curl using one of the following syntaxes:
      ...
      name=content
      This will make curl URL-encode the content part and pass that on.
      Note that the name part is expected to be URL-encoded already.

由于curl会对内容进行URL编码,我们需要传递一个尚未进行URL编码的字符串:

In [7]: urllib.unquote(urllib.urlencode({"name":"汉字".decode('utf-8').encode('GB2312'),"id":u"12345"}))
Out[7]: 'name=\xba\xba\xd7\xd6&id=12345'

因此,请尝试

url="http://xxx.com/login.asp"
curl --data-urlencode 'name=\xba\xba\xd7\xd6&id=12345' $url