I would like to test some of the methods in my spider. For example, my project has this layout:

toto/
├── __init__.py
├── items.py
├── pipelines.py
├── settings.py
├── spiders
│   ├── __init__.py
│   └── mySpider.py
└── Unitest
    └── unitest.py

My unitest.py looks like this:

# -*- coding: utf-8 -*-
import re
import weakref
import six
import unittest
from scrapy.selector import Selector
from scrapy.crawler import Crawler
from scrapy.utils.project import get_project_settings
from unittest.case import TestCase
from toto.spiders import runSpider


class SelectorTestCase(unittest.TestCase):
    sscls = Selector

    def test_demo(self):
        # Placeholder test; the parenthesized form works on Python 2 and 3
        print("test")


if __name__ == '__main__':
    unittest.main()

And my mySpider.py looks like this:

import scrapy


class runSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['http://blog.scrapinghub.com']

    def parse(self, response):
        # Follow the monthly archive links (paths ending in /YYYY/MM/)
        for url in response.css('ul li a::attr("href")').re(r'.*/\d\d\d\d/\d\d/$'):
            yield scrapy.Request(response.urljoin(url), self.parse_titles)

    def parse_titles(self, response):
        # Emit one item per post title found on the archive page
        for post_title in response.css('div.entries > ul > li a::text').extract():
            yield {'title': post_title}

In my unitest.py file, how can I call my spider?
I tried adding from toto.spiders import runSpider to unitest.py, but it does not work.
I get this error:

Traceback (most recent call last):
  File "unitest.py", line 10, in <module>
    from toto.spiders import runSpider
ImportError: No module named toto.spiders

How can I fix it?

Answer 0 (score: 1)
Try:

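import sys
import os

# Put the directory two levels above this file (the parent of toto/) on the import path
sys.path.insert(0, os.path.join(os.path.dirname(os.path.realpath(__file__)), '../..'))
# From that directory the spider module is reached through the toto package
from toto.spiders.mySpider import runSpider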
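
To see why the path matters, here is a minimal sketch that rebuilds the toto/ layout in a temporary directory and imports the spider class from the location where unitest.py would live. It assumes nothing beyond the standard library: the spider written to disk is a stub standing in for the real one (so Scrapy is not needed), and build_demo_project and import_spider are hypothetical helper names used only for this illustration. With the directory two levels above unitest.py on sys.path, the class is reachable as toto.spiders.mySpider.runSpider.

```python
import importlib
import os
import sys
import tempfile


def build_demo_project(root):
    """Create a throwaway copy of the toto/ layout with a stub spider module."""
    for package in ("toto", os.path.join("toto", "spiders")):
        os.makedirs(os.path.join(root, package), exist_ok=True)
        # An empty __init__.py marks each directory as a package.
        open(os.path.join(root, package, "__init__.py"), "w").close()
    os.makedirs(os.path.join(root, "toto", "Unitest"), exist_ok=True)
    with open(os.path.join(root, "toto", "spiders", "mySpider.py"), "w") as f:
        # Stub standing in for the real spider, so the demo does not need Scrapy.
        f.write("class runSpider(object):\n    name = 'blogspider'\n")
    # Path where unitest.py would live; only its location matters here.
    return os.path.join(root, "toto", "Unitest", "unitest.py")


def import_spider(test_file):
    """Put the directory two levels above test_file on sys.path, then import."""
    project_parent = os.path.abspath(
        os.path.join(os.path.dirname(os.path.realpath(test_file)), "..", "..")
    )
    sys.path.insert(0, project_parent)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("toto.spiders.mySpider")
    finally:
        # Undo the path change so the demo leaves sys.path as it found it.
        sys.path.remove(project_parent)
    return module.runSpider


with tempfile.TemporaryDirectory() as tmp:
    spider_cls = import_spider(build_demo_project(tmp))
    print(spider_cls.name)
```

Running the test file from the directory that contains toto/ (for example with python -m unittest and the dotted module path) is another way to get the same import to resolve without editing sys.path by hand.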