Question

I'm using urllib2 and trying to extract the headers from a Response object in a printable form.

Currently I'm printing str(response.info()), but what gets printed is itself a Python string literal (at least as far as I understand it).

(Pdb) p str(response.info())
'Date: Tue, 23 Feb 2010 03:12:26 GMT\r\nServer: Apache\r\nVary: Accept-Encoding,User-Agent\r\nContent-Encoding: gzip\r\nContent-Length: 9045\r\nConnection: close\r\nContent-Type: text/html; charset=ISO-8859-1\r\n'

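I need to convert that string into an "actual" string, for example by evaluating it or something similar. The best theoretical solution I've found is: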

s = str(response.info())
print s.decode("string_escape")

But this doesn't work. Adding to the confusion is how to handle the quotes inside the string; calling eval(s) and str(s) doesn't work either.

Is there a better way to extract the raw headers from the response without the quoting, or a way to decode the string s as described above?

Answer 1

str(info()) gives you a normal string:

>>> import urllib2
>>> f = urllib2.urlopen('http://tejp.de')
>>> print str(f.info())
Connection: close
Vary: Accept-Encoding
Content-Type: text/html
Accept-Ranges: bytes
ETag: "-807357257"
Last-Modified: Wed, 01 Jul 2009 10:05:34 GMT
Content-Length: 285
Date: Tue, 23 Feb 2010 03:24:10 GMT
Server: lighttpd/1.4.19

Only the debugger's p command prints the string in its escaped (repr) form.
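
The difference can be reproduced without pdb or a network connection. The following is a minimal sketch, assuming that p shows the repr() of the value; the header text is copied from the question and the variable names are only illustrative:

```python
# Sample header text taken from the question; no network access is needed.
headers = 'Date: Tue, 23 Feb 2010 03:12:26 GMT\r\nServer: Apache\r\n'

# pdb's p command displays repr(value): quotes are added and
# control characters such as \r and \n are shown as escape sequences.
as_shown_by_p = repr(headers)
print(as_shown_by_p)

# print writes the string itself, so each \r\n is emitted as a real line break.
print(headers)
```

The string was never escaped in the first place, so there is nothing to decode; only the way it is displayed differs.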

Answer 2

From within pdb, this should work:

print str(response.info())

Not sure whether this answers your question.

Answer 3

response.info() returns an httplib.HTTPMessage, which behaves like a mapping:

info = response.info()
for k, v in info.items():
  print '%s: %s' % (k, v)

In short, you're doing it wrong.
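
The mapping-style access can be tried offline. This is only a sketch: it uses the standard library's email.message_from_string as a stand-in for the object returned by response.info(), which is an assumption made so the example runs without a real HTTP request, and the header values are sample data based on the question:

```python
import email

# Sample headers based on the question; a stand-in for a real HTTP response.
raw = ('Date: Tue, 23 Feb 2010 03:12:26 GMT\r\n'
       'Server: Apache\r\n'
       'Content-Length: 9045\r\n')

# Assumption: an email Message stands in for the object from response.info().
msg = email.message_from_string(raw)

# Iterate over the headers as (name, value) pairs.
for name, value in msg.items():
    print('%s: %s' % (name, value))

# Individual headers can be looked up by name, case-insensitively.
server = msg['server']
print(server)
```

Reading headers through the mapping interface avoids any string parsing or unescaping.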

Unescaping a string within a string
