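I want to scrape a series of Facebook posts. To do that, I log in and then load a list of post IDs, so that the login happens only once. However, when I issue the requests with yield, the code never enters the for loop.
Just for testing, I changed the yield to return, and then it does enter the for loop and calls the `parse_page` method.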
``` lang-py
import scrapy
from scrapy.http import FormRequest


class FacebookSpider(scrapy.Spider):
    name = "test"
    start_urls = ['https://mbasic.facebook.com']

    def parse(self, response):
        # Submit the login form found on the landing page.
        return FormRequest.from_response(
            response,
            callback=self.parse_home,
            formxpath='//form[contains(@action, "login")]',
            formdata={'email': "email@email.com", 'pass': "password"},
        )

    def parse_home(self, response):
        print(">> parse_home")
        # Dismiss the "save login" prompt if it is shown, then come back here.
        if response.xpath("//div/input[@value='Ok' and @type='submit']"):
            print(">> if condition")
            return FormRequest.from_response(response, formdata={'name_action_selected': 'dont_save'}, callback=self.parse_home, dont_filter=True,)
        # Request each post page once logged in.
        for post in [1,2]:
            print(">> for loop")
            href = response.urljoin("/335653391129/posts/10157014203171130".format(post))
            yield scrapy.Request(url=href, callback=self.parse_page, dont_filter=True,)

    def parse_page(self, response):
        print("____ parse_page _________")
```
The output when using yield is:
>> parse_home
>> if condition
The output after changing only the yield to return is:
>> parse_home
>> if condition
>> parse_home
>> for loop
____ parse_page _________
I don't understand what is happening. Thanks in advance.
Answer 0 (score: 1)
Your `parse_home` method is a generator, and you shouldn't use `return` with a value inside a generator. That said, I tested your code and it seems to work fine.
For more information, see Python SyntaxError: ("'return' with argument inside generator",)
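
To illustrate the point about generators, here is a minimal sketch in plain Python with no Scrapy involved. The function names `buggy_callback` and `fixed_callback` and the string placeholders standing in for requests are hypothetical, chosen only to mirror the shape of `parse_home`. In Python 3, `return <value>` inside a generator is allowed but simply ends iteration, so whoever iterates over the callback's result never sees the value. In Python 2 the same code raises the SyntaxError mentioned above.

```python
def buggy_callback(needs_confirm):
    """Mixes `return <value>` with `yield`, like parse_home in the question."""
    if needs_confirm:
        # The function contains `yield`, so it is a generator: this value only
        # ends iteration and is never handed to the code iterating over it.
        return "confirm-request"
    for post in [1, 2]:
        yield "post-request-%d" % post


def fixed_callback(needs_confirm):
    """Yields the follow-up request, then stops with a bare `return`."""
    if needs_confirm:
        yield "confirm-request"
        return
    for post in [1, 2]:
        yield "post-request-%d" % post


# Iterate the way a framework consuming the callback's result would.
buggy_confirm = list(buggy_callback(True))
fixed_confirm = list(fixed_callback(True))
fixed_posts = list(fixed_callback(False))

print(buggy_confirm)   # []
print(fixed_confirm)   # ['confirm-request']
print(fixed_posts)     # ['post-request-1', 'post-request-2']
```

Applied to the spider, this suggests replacing `return FormRequest.from_response(...)` in `parse_home` with `yield FormRequest.from_response(...)` followed by a bare `return`. It would also explain the second output in the question: once the `yield` is replaced by `return`, `parse_home` is an ordinary function, so the form request is actually returned, and on the second call the loop returns after its first iteration.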