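AttributeError: 'Response' object has no attribute 'selector'

Time: 2015-08-14 10:03:10

Tags: python attributeerror

I have a new problem. My last question (which was also my first) was answered very well. Now I am getting an AttributeError. It happens during the crawl; the output is below. I don't understand how this can happen, because my code comes straight from the official Scrapy tutorial. What is wrong? Thanks again!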

2015-08-14 11:36:39 [scrapy] DEBUG: Crawled (200) <GET http://www.adacta.si/images/en/CP-Suite-Brochure.pdf> (referer: http://www.adacta.si/storitve/podpora-procesu-planiranja)
2015-08-14 11:36:39 [scrapy] ERROR: Spider error processing <GET http://www.adacta.si/images/en/CP-Suite-Brochure.pdf> (referer: http://www.adacta.si/storitve/podpora-procesu-planiranja)
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\scrapy\utils\defer.py", line 102, in iter_errback
    yield next(it)
  File "C:\Python27\lib\site-packages\scrapy\spidermiddlewares\offsite.py", line 28, in process_spider_output
    for x in result:
  File "C:\Python27\lib\site-packages\scrapy\spidermiddlewares\referer.py", line 22, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "C:\Python27\lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "C:\Python27\lib\site-packages\scrapy\spidermiddlewares\depth.py", line 54, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "E:\analitika\SURS\tutorial\tutorial\spiders\job_spider.py", line 25, in parse
    response.selector.remove_namespaces()
AttributeError: 'Response' object has no attribute 'selector'


#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import scrapy
from scrapy.spiders import Rule
from scrapy.linkextractors import LinkExtractor
from tutorial.items import JobItem
from scrapy.utils.response import get_base_url
from scrapy.http import Request
from urlparse import urlparse, urljoin
from datetime import datetime


class JobSpider(scrapy.Spider):
    name = "jobs"
    #allowed_domains = ["www.aclovse.si"]
    start_urls = ["http://www.adacta.si"]

    # Checklist of job URLs already seen, used to avoid duplicate results.
    jobs_urls = []


    def parse(self, response):

        response.selector.remove_namespaces() 

        # Select every URL, i.e. the value of every "href" attribute.
        # These point either to pages on our own site or to other sites.
        urls = response.xpath('//@href').extract()
        #... and so on
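
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the failing request is for a PDF, and Scrapy passes non-text content to the callback as a plain Response, which has no selector; only text responses (HTML, XML) carry one. The helper names supports_selectors and extract_hrefs below are hypothetical and not part of Scrapy. The guard uses duck typing so that it runs without Scrapy installed; inside a real spider, checking isinstance(response, scrapy.http.TextResponse) should serve the same purpose.

```python
def supports_selectors(response):
    """Return True if the response object exposes a `selector` attribute.

    Assumption: binary responses (PDF, images) lack this attribute,
    which is what the traceback above suggests.
    """
    return hasattr(response, "selector")


def extract_hrefs(response):
    """Return all href values, or an empty list for non-text responses."""
    if not supports_selectors(response):
        # Nothing to query with XPath in a binary body, so skip it.
        return []
    response.selector.remove_namespaces()
    return response.xpath('//@href').extract()
```

In parse, the first line could then become a guard such as `if not supports_selectors(response): return`, so that links pointing at PDFs are skipped instead of raising.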

0 answers:

No answers yet