How do I add the start_url to an item?

Time: 2016-01-20 17:50:33

Tags: python scrapy

I am new to Python and Scrapy. I want item['Source_Website'] to be the URL of the page I scraped. How can I do that?

I tried item['Source_Website'] = selector.ulr and item['Source_Website'] = start_urls, but had no luck.

from scrapy.selector import Selector
from scrapy.spider import BaseSpider
from shikari.items import ShikariItem

class Radiate(BaseSpider):
    name = "sss"
    download_delay = 3
    concurrent_requests = 1
    allowed_domains = ["website.com"]
    start_urls = ['http://www.website.com/1',
                  'http://www.website.com/2']

    def parse(self, response):
        sel = Selector(response)
        item = ShikariItem()
        item['Heading'] = str(sel.xpath('//h1/text()').extract())
        item['Source_Website'] =
        return item

1 answer:

Answer 0: (score: 1)

Use response.url, like this:

item['Source_Website'] = response.url