I am sending a POST request as described in the Scrapy documentation. It raises no errors, but the response comes back with no content.
    # returned from the callback that parsed the company page
    return scrapy.FormRequest(
        'https://www.crowdfunder.com/company/setInvestSection',
        formdata={'section': 'company-custom', 'company_id': '22392'},
        callback=self.parse_custom)

    def parse_custom(self, response):
        # print the URL that was fetched and the section I want to extract
        print(response.url)
        print(response.xpath('//section[@id="custom"]').extract_first())
Here is the log output:
2016-10-26 13:50:52 [scrapy] DEBUG: Crawled (200) <GET https://www.crowdfunder.com/?q=filter&page=1> (referer: https://www.crowdfunder.com/)
2016-10-26 13:50:55 [scrapy] DEBUG: Crawled (200) <GET https://www.crowdfunder.com/digitzs> (referer: https://www.crowdfunder.com/?q=filter&page=1)
41840
https://www.crowdfunder.com/digitzs
2016-10-26 13:50:56 [scrapy] DEBUG: Crawled (200) <POST https://www.crowdfunder.com/company/setInvestSection> (referer: https://www.crowdfunder.com/digitzs)
https://www.crowdfunder.com/company/setInvestSection
<section id="custom" data-company-id="22392">
</section>
2016-10-26 13:50:56 [scrapy] INFO: Closing spider (finished)
2016-10-26 13:50:56 [scrapy] INFO: Dumping Scrapy stats:
Here are the URL and form data that I sent in the POST request.
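To compare against what the browser's network tab shows, here is a minimal sketch of the URL-encoded body that a form POST with this `formdata` would carry. It uses only the standard library, and the variable names are illustrative. It does not reproduce any extra headers or cookies the browser may send, so those would need to be compared separately.

```python
from urllib.parse import urlencode

# Values copied from the FormRequest call in the question.
url = "https://www.crowdfunder.com/company/setInvestSection"
formdata = {"section": "company-custom", "company_id": "22392"}

# A form POST sends the fields as a URL-encoded body with this content type.
body = urlencode(formdata)
headers = {"Content-Type": "application/x-www-form-urlencoded"}

print("POST", url)
print(headers)
print(body)
```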
The response I want, as seen in the browser:
<section id="custom" data-company-id="22392">
<h2 class="section-title">See Kevin Harrington Talk About Digitzs</h2>
<div class="custom-content"><p><a href="https://youtu.be/gbLS0FsqYw8" target="_blank"><img src="https://com-prod.s3.amazonaws.com/site/company/custom/22392/site/custom_content/6cf1786e396758009a82aa1006b6148e.jpg" alt="Kevin Harrington Invests in Digitzs" /></a></p></div>
</section>
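The difference between the two responses is that the `custom` section arrives empty in Scrapy and populated in the browser. A rough way to detect the empty case is sketched below with the standard library only; the class and function names are hypothetical, and the `filled` sample is a shortened copy of the browser response. Inside a spider, a check such as `response.xpath('//section[@id="custom"]/*')` should serve the same purpose.

```python
from html.parser import HTMLParser


class CustomSectionProbe(HTMLParser):
    """Counts tags and visible text inside <section id="custom">."""

    def __init__(self):
        super().__init__()
        self.inside = False   # True while within the target section
        self.nested = 0       # nested <section> tags that are still open
        self.child_tags = 0   # tags seen inside the target section
        self.text_chars = 0   # non-whitespace text length inside it

    def handle_starttag(self, tag, attrs):
        if self.inside:
            self.child_tags += 1
            if tag == "section":
                self.nested += 1
        elif tag == "section" and dict(attrs).get("id") == "custom":
            self.inside = True

    def handle_endtag(self, tag):
        if self.inside and tag == "section":
            if self.nested:
                self.nested -= 1
            else:
                self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.text_chars += len(data.strip())


def section_is_empty(html):
    """Return True when the custom section has no child tags and no text."""
    probe = CustomSectionProbe()
    probe.feed(html)
    probe.close()
    return probe.child_tags == 0 and probe.text_chars == 0


empty = '<section id="custom" data-company-id="22392">\n</section>'
filled = ('<section id="custom" data-company-id="22392">'
          '<h2 class="section-title">See Kevin Harrington Talk About Digitzs</h2>'
          '</section>')

print(section_is_empty(empty))
print(section_is_empty(filled))
```

With the two samples above, the first call reports the section as empty and the second does not.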
Working code after switching to Splash:
    scrapy_splash.SplashFormRequest(
        'https://www.crowdfunder.com/company/setInvestSection',
        self.parse_custom,
        meta={'company_id': company_id},
        formdata={'section': 'company-custom', 'company_id': '22392'},
        endpoint='http://192.168.99.100:8050/render.html')
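For the Splash request above to go through Splash, the project also needs the scrapy-splash middlewares enabled. The following is a sketch of a `settings.py` fragment based on the scrapy-splash README; the `SPLASH_URL` host is an assumption taken from the `endpoint` value in the snippet above, and the actual project settings were not shown.

```python
# settings.py (sketch; SPLASH_URL host assumed from the endpoint used above)
SPLASH_URL = 'http://192.168.99.100:8050'

# Middlewares and priorities as listed in the scrapy-splash README.
DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}

# Make duplicate filtering aware of Splash arguments.
DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
```

With `SPLASH_URL` set this way, passing `endpoint='render.html'` instead of the full URL should be enough, since the endpoint is joined onto the base URL.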