Creating a loop within a loop in Scrapy

Time: 2017-01-10 12:18:35

Tags: python loops for-loop scrapy

I have two pieces of code that produce different results in Scrapy:

Defined above both code samples:

b_result_list = []
b_result_page = []

1) b_result_page

NEXT_PAGE_SELECTOR = 'a.sb_pagN ::attr(href)'
next_page = response.css(NEXT_PAGE_SELECTOR).extract_first()
if next_page:
    yield scrapy.Request(
        response.urljoin(next_page),
        callback=self.parse
    )
    b_result_page.append(next_page) 

Sample data produced:

['b.com/search?q=site%3asite.com&first=11&FORM=PORE',
 'b.com/search?q=site%3asite.com&first=21&FORM=PORE']

2) b_result_list

LIST_SELECTOR = '.b_algo'
NAME_SELECTOR = 'h2 a ::attr(href)'
for bresult in response.css(LIST_SELECTOR):
    # Use NAME_SELECTOR to pull the link out of each result block
    link = bresult.css(NAME_SELECTOR).extract_first()
    yield {
        'name': link,
    }
    b_result_list.append(link)

Sample data produced:

['somesite.com', 'blog.somesite.com', 'somesite.com/about/contactus.php']

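Question: how can I do the following (I can't wrap my head around it):

For each page in b_result_page, visit that page and extract its links into b_result_list.

Would code like this solve my problem?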

for brp in b_result_page:
    # Note: brp is never requested here, so response is still the page
    # that was passed to the current callback
    LIST_SELECT = '.b_algo'
    NAME_SELECT = 'h2 a ::attr(href)'
    for page_item_result in response.css(LIST_SELECT):
        link = page_item_result.css(NAME_SELECT).extract_first()
        yield {
            'name': link,
        }
        b_result_list.append(link)

Thanks

0 answers:

No answers