I need to get the URL of every product on this page, http://www.stalkbuylove.com/new-arrivals/week-2.html#/page/1, and then fetch each product's details from its product link. I am not sure how to do that.
import scrapy
import json
import redis

r_server = redis.Redis('localhost')


class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["stalkbuylove.com"]
    start_urls = [
        "http://www.stalkbuylove.com/new-arrivals/week-2.html#/page/1"
    ]

    def parse(self, response):
        # One .product-detail-slide element per product on the listing page
        for sel in response.css('.product-detail-slide'):
            name = sel.xpath('div/a/@title').extract()
            price = sel.xpath('div/span/span/text()').extract()
            productUrl = sel.xpath('div/a/@href').extract()
            # Request for the product detail page, with parseProductPage as its callback
            request = scrapy.Request(''.join(productUrl), callback=self.parseProductPage)
            # Store the listing fields in a Redis hash keyed by the product name
            r_server.hset(name, "Name", name)
            r_server.hset(name, "Price", price)
            r_server.hset(name, "ProductUrl", productUrl)
            print name, price, productUrl

    def parseProductPage(self, response):
        # Product detail page: read the availability link
        for sel in response.css('.top-details-product'):
            availability = sel.xpath('div/link/@href').extract()
            print availability
Can someone help? Once I have a product URL, how do I crawl that URL? Right now I am calling parseProductPage, but it is not working correctly.