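I'm using urllib2 to interact with a website that sends back multiple Set-Cookie headers. However, the response header dictionary only contains one of them; the duplicate keys seem to override each other.
Is there a way to access duplicate headers with urllib2?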
Answer 0: (score: 5)
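According to the urllib2 docs, the .headers attribute of the resulting URL object is an httplib.HTTPMessage (which appears to be undocumented, at least in the Python docs).
However,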
help(httplib.HTTPMessage)
...
If multiple header fields with the same name occur, they are combined
according to the rules in RFC 2616 sec 4.2:
Appending each subsequent field-value to the first, each separated
by a comma. The order in which header fields with the same field-name
are received is significant to the interpretation of the combined
field value.
So if you access u.headers['Set-Cookie'], you should get a single Set-Cookie header whose values are separated by commas.
Indeed, this appears to be the case.
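import httplib
from StringIO import StringIO
# Three Set-Cookie fields, then a non-header line that ends header parsing.
msg = \
"""Set-Cookie: Foo
Set-Cookie: Bar
Set-Cookie: Baz
This is the message"""
msg = StringIO(msg)
msg = httplib.HTTPMessage(msg)
# Dictionary-style access returns the repeated fields joined by ", ".
assert msg['Set-Cookie'] == 'Foo, Bar, Baz'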
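If you need each value separately rather than the comma-joined string, the message object can also return every occurrence of a field. The following is only a sketch: the helper names parse and all_values and the RAW sample are made up for illustration, and it assumes getheaders() on the Python 2 httplib.HTTPMessage and get_all() on its Python 3 counterpart.

```python
# Sketch only: parse(), all_values() and RAW are hypothetical names for this example.
try:
    # Python 2, matching the urllib2/httplib code above.
    import httplib
    from StringIO import StringIO

    def parse(raw):
        return httplib.HTTPMessage(StringIO(raw))

    def all_values(msg, name):
        # getheaders() returns one list entry per occurrence of the field.
        return msg.getheaders(name)
except ImportError:
    # Python 3, where httplib became http.client.
    import http.client
    from io import BytesIO

    def parse(raw):
        return http.client.parse_headers(BytesIO(raw.encode("iso-8859-1")))

    def all_values(msg, name):
        # get_all() returns one list entry per occurrence of the field.
        return msg.get_all(name, [])

# Header block terminated by a blank line, as in a real HTTP response.
RAW = "Set-Cookie: Foo\r\nSet-Cookie: Bar\r\nSet-Cookie: Baz\r\n\r\n"

cookies = all_values(parse(RAW), "Set-Cookie")
print(cookies)
```

With a real urllib2 response u, the Python 2 branch would correspond to calling u.headers.getheaders('Set-Cookie').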
Answer 1: (score: 0)
Set-Cookie is different, though. From RFC 6265:
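Origin servers SHOULD NOT fold multiple Set-Cookie header fields into a single header field. The usual mechanism for folding HTTP headers fields (i.e., as defined in [RFC2616]) might change the semantics of the Set-Cookie header field because the %x2C (",") character is used by Set-Cookie in a way that conflicts with such folding.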
So in theory, this looks like a bug.
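To illustrate the conflict the RFC describes, here is a small sketch with two made-up cookie values, one of which carries an Expires date. Once the values are joined with commas, splitting on commas no longer recovers the original fields.

```python
# Hypothetical cookie values, chosen only to show the ambiguity.
cookies = [
    "sid=abc; Expires=Wed, 09 Jun 2021 10:18:14 GMT",
    "theme=dark; Path=/",
]

# Fold the fields the way RFC 2616 sec 4.2 combines repeated headers.
folded = ", ".join(cookies)

# A naive split on commas also cuts the Expires date in half.
naive = [part.strip() for part in folded.split(",")]

print(len(cookies), len(naive))
for part in naive:
    print(repr(part))
```

The comma inside the date is indistinguishable from the separator, which is why reading the fields individually is safer than parsing the combined value.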