I am trying to scrape a website. However, the host keeps redirecting the spider until it fails with "max redirections reached". The log is as follows:
2019-08-21 17:10:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://zjip.patsev.com/pldb-zj/> from <GET http://zjip.patsev.com/>
2019-08-21 17:10:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (meta refresh) to <GET http://zjip.patsev.com/pldb-zj/access/toLogin> from <GET http://zjip.patsev.com/pldb-zj/>
2019-08-21 17:10:57 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://open.cnipr.com/oauth/authorize?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin> from <GET http://zjip.patsev.com/pldb-zj/access/toLogin>
2019-08-21 17:10:57 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://zjip.patsev.com/pldb-zj/?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin> from <GET http://open.cnipr.com/oauth/authorize?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin>
2019-08-21 17:10:58 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (meta refresh) to <GET http://zjip.patsev.com/pldb-zj/access/toLogin> from <GET http://zjip.patsev.com/pldb-zj/?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin>
2019-08-21 17:10:58 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://open.cnipr.com/oauth/authorize?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin> from <GET http://zjip.patsev.com/pldb-zj/access/toLogin>
2019-08-21 17:10:58 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://zjip.patsev.com/pldb-zj/?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin> from <GET http://open.cnipr.com/oauth/authorize?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin>
2019-08-21 17:10:59 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (meta refresh) to <GET http://zjip.patsev.com/pldb-zj/access/toLogin> from <GET http://zjip.patsev.com/pldb-zj/?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin>
2019-08-21 17:10:59 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://open.cnipr.com/oauth/authorize?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin> from <GET http://zjip.patsev.com/pldb-zj/access/toLogin>
2019-08-21 17:11:00 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://zjip.patsev.com/pldb-zj/?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin> from <GET http://open.cnipr.com/oauth/authorize?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin>
The redirects all seem to be necessary up to this one:
2019-08-21 17:10:57 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://zjip.patsev.com/pldb-zj/?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin> from <GET http://open.cnipr.com/oauth/authorize?client_id=8A3C47AC471F1D588A0F84B93E540C06&response_type=code&redirect_uri=http://zjip.patsev.com/pldb-zj/access/oauthLogin>
After that point, however, it still keeps refreshing.
Do you know how to inspect the redirected responses, and how to stop the redirection at the appropriate place? Thanks a lot!
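One way to find where the chain starts repeating is to parse the redirect lines of the log itself. The following is only a minimal sketch, assuming each log entry sits on a single line in the format shown above; `parse_hops` and `first_repeat` are hypothetical helper names made up for illustration, not part of Scrapy.

```python
import re

# Matches the RedirectMiddleware debug lines in the format shown in the log above.
HOP_RE = re.compile(
    r"Redirecting \((?P<kind>[^)]+)\) to <GET (?P<to>\S+)> from <GET (?P<src>\S+)>"
)


def parse_hops(log_text):
    """Return a list of (kind, source_url, target_url) tuples in log order."""
    return [(m["kind"], m["src"], m["to"]) for m in HOP_RE.finditer(log_text)]


def first_repeat(hops):
    """Return (index, target_url) of the first hop whose target was already
    visited as a target earlier in the chain, or None if no target repeats."""
    seen = set()
    for index, (_kind, _src, target) in enumerate(hops):
        if target in seen:
            return index, target
        seen.add(target)
    return None
```

Applied to the log above, the first repeated target is http://zjip.patsev.com/pldb-zj/access/toLogin, reached again on the fifth redirect, so the loop is toLogin -> oauth/authorize -> /pldb-zj/?client_id=... -> toLogin. Inside a spider, Scrapy's redirect middleware should also record the visited URLs in `response.meta['redirect_urls']`, and a request can opt out of automatic redirects with `meta={'dont_redirect': True, 'handle_httpstatus_list': [302]}` so that the 302 response reaches the callback for inspection. Whether stopping there is enough for this site is an open question, since the loop may come from a missing login session rather than from the redirects themselves.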
Update: the URL I checked in the browser is http://zjip.patsev.com/. If I use requests, the same problem does not occur:
res = requests.get('http://zjip.patsev.com/', proxies=proxy_dict, headers=headers)