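I'm having some trouble using the Twisted Python library to access content hosted over HTTPS. I'm new to this library, and I assume there is some concept I'm missing that is causing the problem, though perhaps not, given that my code is based on the documentation example.
Here is a link to the page I took the example from: https://twistedmatrix.com/documents/current/web/howto/client.html
It appears under the heading HTTP over SSL: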
from twisted.python.log import err
from twisted.web.client import Agent
from twisted.internet import reactor
from twisted.internet.ssl import optionsForClientTLS

def display(response):
    print("Received response")
    print(response)

def main():
    contextFactory = optionsForClientTLS(u"https://example.com/")
    agent = Agent(reactor, contextFactory)
    d = agent.request("GET", "https://example.com/")
    d.addCallbacks(display, err)
    d.addCallback(lambda ignored: reactor.stop())
    reactor.run()

if __name__ == "__main__":
    main()

When I run this code, it fails right at startup. I get the following error:
Traceback (most recent call last):
  File "https.py", line 19, in <module>
    main()
  File "https.py", line 11, in main
    contextFactory = optionsForClientTLS(u"https://example.com/")
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/internet/_sslverify.py", line 1336, in optionsForClientTLS
    return ClientTLSOptions(hostname, certificateOptions.getContext())
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/internet/_sslverify.py", line 1198, in __init__
    self._hostnameBytes = _idnaBytes(hostname)
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/internet/_sslverify.py", line 86, in _idnaBytes
    return idna.encode(text)
  File "/usr/local/lib/python2.7/dist-packages/idna/core.py", line 355, in encode
    result.append(alabel(label))
  File "/usr/local/lib/python2.7/dist-packages/idna/core.py", line 276, in alabel
    check_label(label)
  File "/usr/local/lib/python2.7/dist-packages/idna/core.py", line 253, in check_label
    raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label)))
idna.core.InvalidCodepoint: Codepoint U+003A at position 6 of u'https://example' not allowed
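This error leads me to believe that the argument passed to optionsForClientTLS is incorrect. It expects a hostname rather than a full URL, so I shortened the argument to example.com. With that change, the function completes successfully.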
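A minimal sketch of that change, assuming the hostname is pulled out of the URL with the standard library's urlparse. The helper names hostname_for_tls and make_tls_options are hypothetical and not part of Twisted; only optionsForClientTLS is the real call from the example:

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def hostname_for_tls(url):
    """Return only the host part of a URL (no scheme, port, or path)."""
    return urlparse(url).hostname


def make_tls_options(url):
    """Build TLS client options from a full URL (hypothetical helper)."""
    # Imported lazily so hostname_for_tls can be used without Twisted installed.
    from twisted.internet.ssl import optionsForClientTLS
    # optionsForClientTLS wants a text hostname such as u"example.com";
    # passing the full URL is what triggered the InvalidCodepoint error.
    return optionsForClientTLS(hostname_for_tls(url))
```

Here hostname_for_tls(u"https://example.com/") yields the bare hostname, which is the value the function accepts.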
Unfortunately, after making that change, the script now fails on the line that calls agent.request. The error it gives is:
Traceback (most recent call last):
  File "https.py", line 19, in <module>
    main()
  File "https.py", line 13, in main
    d = agent.request("GET", "https://example.com/")
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/web/client.py", line 1596, in request
    endpoint = self._getEndpoint(parsedURI)
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/web/client.py", line 1580, in _getEndpoint
    return self._endpointFactory.endpointForURI(uri)
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/web/client.py", line 1456, in endpointForURI
    uri.port)
  File "/home/amaricich/.local/lib/python2.7/site-packages/twisted/web/client.py", line 982, in creatorForNetloc
    context = self._webContextFactory.getContext(hostname, port)
AttributeError: 'ClientTLSOptions' object has no attribute 'getContext'
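This error leads me to believe that the object produced by optionsForClientTLS is not the type of object that Agent expects to be given when it is created: Agent is trying to call a method that does not exist on it. With all that said, I have two questions.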
Answer 0 (score: 1)
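Yes, you're absolutely right: the example in the documentation is wrong. I noticed the bug while working with treq. Try following this example from v14 instead. That said, you should use treq rather than trying to use Twisted directly. Most of the heavy lifting has already been taken care of for you. Here is a straightforward conversion of your example: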
from __future__ import print_function
import treq
from twisted.internet import defer, task
from twisted.python.log import err

@defer.inlineCallbacks
def display(response):
    content = yield treq.content(response)
    print('Content: {0}'.format(content))

def main(reactor):
    d = treq.get('https://twistedmatrix.com')
    d.addCallback(display)
    d.addErrback(err)
    return d

task.react(main)
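As you can see, treq takes care of the SSL details for you. The display() callback can be used to extract various components of the HTTP response, such as the headers, status code, body, and so on. If you only need a single component, such as the response body, you can simplify further like this: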
def main(reactor):
    d = treq.get('https://twistedmatrix.com')
    d.addCallback(treq.content)  # get response content when available
    d.addErrback(err)
    d.addCallback(print)
    return d

task.react(main)
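If you would still rather stay with plain Twisted, here is a rough sketch of what a working Agent version might look like. It assumes Twisted 14.0 or later, where Agent falls back to a browser-like HTTPS policy when no contextFactory is given, so the optionsForClientTLS object is not passed in at all. The helper names as_bytes, fetch, and show are made up for illustration:

```python
def as_bytes(text):
    """Agent.request takes the method and URI as bytes, so encode text."""
    return text if isinstance(text, bytes) else text.encode("ascii")


def fetch(reactor, url):
    """Request url with a plain Agent and fire with the response body."""
    # Imported lazily so the helpers here can be imported without Twisted.
    from twisted.web.client import Agent, readBody
    # No contextFactory argument: the default HTTPS policy verifies the
    # server certificate against the hostname in the URL.
    agent = Agent(reactor)
    d = agent.request(b"GET", as_bytes(url))
    d.addCallback(readBody)  # collect the whole body as bytes
    return d


def show(body):
    """Print the response body, replacing any undecodable bytes."""
    print(body.decode("utf-8", "replace"))


def main(reactor):
    from twisted.python.log import err
    d = fetch(reactor, u"https://example.com/")
    d.addCallback(show)
    d.addErrback(err)
    return d
```

To run it, pass main to task.react from twisted.internet.task, just as in the treq versions above.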