How to run a Scrapy spider with arguments from a Django view

Asked: 2016-04-27 16:47:05

Tags: python django web-scraping scrapy

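A user can enter a keyword in a form and submit it, so I can read the keyword in my view and build a start_url from it. How do I pass that start_url to a Scrapy spider and start the spider?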

Here is my view function.

from django.shortcuts import render

def results(request):
    """Return the search results."""
    key = request.GET['keyword'].strip()
    books = Book.objects.filter(title__contains=key)
    # filter() never returns None, so test for an empty QuerySet instead.
    if not books.exists():
        # I want to call the scrapy spider here.
        pass
        # Query again once the spider has stored its results.
        books = Book.objects.filter(title__contains=key)
    context = {'books': books, 'key': key}
    return render(request, 'search/results.html', context)

Here is my spider class's __init__() method.

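def __init__(self, key, *args, **kwargs):
    # Let the base Spider class run its own initialisation first.
    super().__init__(*args, **kwargs)
    self.key = key
    url = "http://search.example.com/?key=" + key
    self.start_urls = [url]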
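
Because the keyword comes straight from user input, plain string concatenation can produce an invalid URL when the keyword contains spaces, "&" or non-ASCII characters. A minimal sketch of building the start URL with the standard library follows; the helper name build_start_url is hypothetical, and the endpoint is the one shown in the question.

```python
from urllib.parse import urlencode

# Search endpoint taken from the question's spider.
SEARCH_URL = "http://search.example.com/"


def build_start_url(key):
    """Return the search URL for a user-supplied keyword (hypothetical helper)."""
    # urlencode escapes spaces, '&', non-ASCII characters and so on.
    return SEARCH_URL + "?" + urlencode({"key": key.strip()})
```

Inside the spider, the two URL lines could then become self.start_urls = [build_start_url(key)].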

1 Answer:

Answer 0: (score: 4)

This worked for me:

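import os

from scrapy.crawler import CrawlerRunner
from scrapy.utils.project import get_project_settings

# filter() never returns None, so test for an empty QuerySet instead.
if not books.exists():
    # Point Scrapy at the project's settings module before loading settings.
    os.environ.setdefault("SCRAPY_SETTINGS_MODULE", "whereyourscrapysettingsare")
    crawler_settings = get_project_settings()
    crawler = CrawlerRunner(crawler_settings)
    # Keyword arguments are forwarded to the spider's __init__().
    # crawl() returns a Deferred and needs a running Twisted reactor.
    crawler.crawl(yourspider, key=key)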

Source: http://doc.scrapy.org/en/latest/topics/practices.html#run-scrapy-from-a-script
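
CrawlerRunner only schedules the crawl; it relies on a Twisted reactor that a regular Django view does not normally run. One alternative, which is not part of the answer above, is to launch the scrapy crawl command in a child process and pass the keyword with -a, which Scrapy forwards to the spider's __init__(). The sketch below assumes the scrapy executable is on PATH and that project_dir is the directory containing scrapy.cfg; the function names and the 60-second timeout are hypothetical.

```python
import subprocess


def build_crawl_command(spider_name, key):
    """Build the argv for `scrapy crawl <spider_name> -a key=<key>`."""
    # A list (rather than a shell string) keeps the keyword from being
    # interpreted by a shell.
    return ["scrapy", "crawl", spider_name, "-a", "key=" + key]


def run_spider(spider_name, key, project_dir, timeout=60):
    """Run the spider in a child process and wait for it to finish.

    Returns True when the process exits with status 0. Raises
    subprocess.TimeoutExpired if the crawl takes longer than `timeout`.
    """
    result = subprocess.run(
        build_crawl_command(spider_name, key),
        cwd=project_dir,  # assumed to contain scrapy.cfg
        timeout=timeout,
    )
    return result.returncode == 0
```

In the view, a call such as run_spider("books", key, "/path/to/scrapy/project") would block until the crawl ends, after which the Book query can be repeated; the spider name and path here are placeholders. Blocking a request for the duration of a crawl may be too slow in practice, in which case a task queue would be the usual next step.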