Question

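I'm new to Scrapy and Python. I need to extract only the title of each URL, not the page content. The code below extracts both the content and the title. Any help with this would be appreciated.

Thanks in advance.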

import scrapy

class BlogSpider(scrapy.Spider):
    name = 'bg'
    start_urls = ['https://blog.scrapinghub.com', 'https://scrapinghub.com/']

    def parse(self, response):
        # Yield the link text of every post heading on the page
        for title in response.css('h2.entry-title'):
            yield {'title': title.css('a ::text').extract_first()}

        # Save the full response body to a file named after a URL segment
        page = response.url.split("/")[-2]
        filename = 'urltitle-%s.html' % page
        with open(filename, 'wb') as f:
            f.write(response.body)

Answer 1

I'm not sure I've correctly understood what you mean by 'title', but if what you need is the title from the page's markup, you can use the appropriate title selector to extract it.

Extract the title of a URL using Scrapy, Python
