This curl request works:
https://<user>:<pass>@xecdapi.xe.com/v1/convert_from.json/?from=1000000&to=SGD&amount=AED,AUD,BDT&inverse=True
But this Scrapy request does not work:
yield scrapy.Request("https://<user>:<pass>@xecdapi.xe.com/v1/convert_from.json/?from=1000000&to=SGD&amount=AED,AUD,BDT&inverse=True")
It returns this error:
Traceback (most recent call last):
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\twisted\internet\defer.py", line 1297, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\twisted\python\failure.py", line 389, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
defer.returnValue((yield download_func(request=request,spider=spider)))
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\scrapy\utils\defer.py", line 45, in mustbe_deferred
result = f(*args, **kw)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 65, in download_request
return handler.download_request(request, spider)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 61, in download_request
return agent.download_request(request)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 286, in download_request
method, to_bytes(url, encoding='ascii'), headers, bodyproducer)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\twisted\web\client.py", line 1596, in request
endpoint = self._getEndpoint(parsedURI)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\twisted\web\client.py", line 1580, in _getEndpoint
return self._endpointFactory.endpointForURI(uri)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\twisted\web\client.py", line 1456, in endpointForURI
uri.port)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\scrapy\core\downloader\contextfactory.py", line 59, in creatorForNetloc
return ScrapyClientTLSOptions(hostname.decode("ascii"), self.getContext())
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\twisted\internet\_sslverify.py", line 1201, in __init__
self._hostnameBytes = _idnaBytes(hostname)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\twisted\internet\_sslverify.py", line 87, in _idnaBytes
return idna.encode(text)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\idna\core.py", line 355, in encode
result.append(alabel(label))
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\idna\core.py", line 276, in alabel
check_label(label)
File "d:\kerja\hit\python~1\<project_name>\<project_name>\lib\site-packages\idna\core.py", line 253, in check_label
raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label)))
InvalidCodepoint: Codepoint U+003A at position 28 of u'xxxxxxxxxxxxxxxxxxxxxxxxxxxx:xxxxxxxxxxxxxxxxxxxxxxxxxxx@xecdapi' not allowed
Answer 0 (score: 3)
Scrapy does not support HTTP authentication embedded in the URL. As the traceback shows, the whole user:pass@host part ends up being treated as the hostname to IDNA-encode, and the colon (U+003A) is not allowed there. Use HttpAuthMiddleware instead.
In settings.py:
DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware': 811,
}
In your spider:
from scrapy.spiders import CrawlSpider

class SomeIntranetSiteSpider(CrawlSpider):
    # Picked up by HttpAuthMiddleware for every request this spider makes
    http_user = 'someuser'
    http_pass = 'somepass'
    name = 'intranet.example.com'
    # .. rest of the spider code omitted ...
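
Applied to the request from the question, the spider might look roughly like this (a sketch: the spider name and credential values are placeholders, and the user:pass@ part is removed from the URL because HttpAuthMiddleware sends the credentials as a Basic auth Authorization header instead):

import json

import scrapy

class XeRatesSpider(scrapy.Spider):
    name = 'xe_rates'
    # Picked up by HttpAuthMiddleware and turned into an Authorization header
    http_user = 'someuser'
    http_pass = 'somepass'

    def start_requests(self):
        # Same URL as in the question, but without the credentials before the host
        yield scrapy.Request(
            "https://xecdapi.xe.com/v1/convert_from.json/"
            "?from=1000000&to=SGD&amount=AED,AUD,BDT&inverse=True",
            callback=self.parse,
        )

    def parse(self, response):
        # The API returns JSON; yield it as an item or process it further
        yield json.loads(response.text)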