I have written code for a Django web page that includes a user input form. When a user enters text into the form and clicks the submit button, a Celery task that runs a Scrapy spider needs to be started. The form takes the name of a band, which is passed to the spider as an argument and appended to its start URL. With my code so far, whenever I run python manage.py celery worker --loglevel=info or python manage.py runserver, the Scrapy spider's log output starts, but it never actually shows any pages being crawled; and when I do submit the form, the spider does not run at all. What is the correct way to run the Celery task when the submit button is clicked? I was following the solution from this SO post, but Scrapy and Celery have both been updated since and that solution no longer seems to work. The code for the relevant files is below:
tasks.py
from celery.registry import tasks
from celery.task import Task
from django.template.loader import render_to_string
from django.utils.html import strip_tags
from django.core.mail import EmailMultiAlternatives
from ticket_city_scraper.ticket_city_scraper.spiders.tc_spider import spiderCrawl
from celery import shared_task
@shared_task
def crawl():
    return spiderCrawl()
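Would the task need to launch the spider differently — for example, by shelling out to the scrapy CLI, since Scrapy's Twisted reactor cannot be restarted inside a long-lived worker process? A rough sketch of what I mean (the spider name "tc" and the project path are assumptions):

# Sketch only: the spider name "tc" and the project path are assumptions.
import subprocess

from celery import shared_task

@shared_task
def crawl(band_name):
    # Running "scrapy crawl" in a child process avoids restarting
    # Scrapy's Twisted reactor inside the long-lived worker process.
    subprocess.check_call(
        ["scrapy", "crawl", "tc", "-a", "band=" + band_name],
        cwd="/path/to/ticket_city_scraper",  # assumed project directory
    )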
As the views file shows, the crawl method is only called in the choice view, and yet the spider log starts up every time a new page is visited:
views.py
from django.shortcuts import render
from .forms import ContactForm, SignUpForm, BandForm
from tasks import crawl
def choice(request):
    title = 'Welcome'
    form = SignUpForm(request.POST or None)
    context = {
        "title": title,
        "form": form,
    }
    if form.is_valid():
        instance = form.save(commit=False)
        full_name = form.cleaned_data.get("full_name")
        if not full_name:
            full_name = "New full name"
        instance.full_name = full_name
        # if not instance.full_name:
        #     instance.full_name = "A name"
        instance.save()
        context = {
            "title": "Thank you",
        }
    crawl.delay()
    return render(request, "home.html", context)
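Should the call instead live inside the is_valid() branch, so it only fires when the form is actually submitted? Something like the sketch below (the "band" field name on SignUpForm is an assumption):

# Sketch only: the field name "band" on SignUpForm is an assumption.
def choice(request):
    form = SignUpForm(request.POST or None)
    context = {"title": "Welcome", "form": form}
    if form.is_valid():
        form.save()
        # Queue the crawl only on a successful submit, passing the band
        # name through to the task (and from there to the spider).
        crawl.delay(form.cleaned_data.get("band"))
        context = {"title": "Thank you"}
    return render(request, "home.html", context)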
Terminal window when the server is running:
 -------------- celery@elijah-VirtualBox v3.1.18 (Cipater)
---- **** -----
--- * ***  * -- Linux-3.13.0-54-generic-x86_64-with-Ubuntu-14.04-trusty
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app:         default:0x7faaebc80410 (djcelery.loaders.DjangoLoader)
- ** ---------- .> transport:   amqp://guest:**@localhost:5672//
- ** ---------- .> results:     database
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ----
--- ***** ----- [queues]
 -------------- .> celery           exchange=celery(direct) key=celery
[tasks]
. comparison.tasks.crawl
[2015-08-21 23:15:21,076: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[2015-08-21 23:15:21,186: INFO/MainProcess] mingle: searching for neighbors
[2015-08-21 23:15:22,244: INFO/MainProcess] mingle: all alone
/home/elijah/Desktop/trydjango18/trydjango18/local/lib/python2.7/site-packages/djcelery/loaders.py:136: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warn('Using settings.DEBUG leads to a memory leak, never '
[2015-08-21 23:15:22,331: WARNING/MainProcess] /home/elijah/Desktop/trydjango18/trydjango18/local/lib/python2.7/site-packages/djcelery/loaders.py:136: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warn('Using settings.DEBUG leads to a memory leak, never '
[2015-08-21 23:15:22,333: WARNING/MainProcess] celery@elijah-VirtualBox ready.
[2015-08-21 23:15:24,294: INFO/MainProcess] Received task: comparison.tasks.crawl[d930a0e8-7d63-4d55-ba85-53bb174f98f4]
[2015-08-21 23:15:24,296: INFO/MainProcess] Received task: comparison.tasks.crawl[37187368-cfd1-4b9e-9a2e-8e14266947ef]
[2015-08-21 23:15:24,298: INFO/MainProcess] Received task: comparison.tasks.crawl[d5aa8448-2ee5-47f9-8b6e-5112201665ef]
[2015-08-21 23:15:24,300: INFO/MainProcess] Received task: comparison.tasks.crawl[d8ae8663-3fe1-484b-b43b-d54f173fd85e]
[2015-08-21 23:15:24,301: INFO/MainProcess] Received task: comparison.tasks.crawl[1eb42061-ec5a-4697-9df8-9b07c62f04f9]
[2015-08-21 23:15:24,302: INFO/MainProcess] Received task: comparison.tasks.crawl[d3a7619f-2fcc-4105-93f8-b2ac9004593b]
[2015-08-21 23:15:24,303: INFO/MainProcess] Received task: comparison.tasks.crawl[2b06afd0-24ab-4198-a49e-b32dfe0ca804]
[2015-08-21 23:15:24,505: ERROR/MainProcess] Task comparison.tasks.crawl[37187368-cfd1-4b9e-9a2e-8e14266947ef] raised unexpected: NameError("global name 'MySpider' is not defined",)
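For reference, my understanding of what tc_spider.py would need to contain for spiderCrawl() to work — the NameError above suggests it still references a class called MySpider that is never defined. Everything in this sketch (the TCSpider class, its spider name, the URL) is an assumption, not my actual spider:

# Sketch only: class name, spider name, and URL are assumptions.
import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

class TCSpider(scrapy.Spider):
    name = "tc"

    def __init__(self, band=None, *args, **kwargs):
        super(TCSpider, self).__init__(*args, **kwargs)
        # The band name submitted in the form is joined onto the start URL.
        self.start_urls = ["https://www.ticketcity.com/search?q=%s" % band]

    def parse(self, response):
        pass  # extraction logic would go here

def spiderCrawl(band):
    # CrawlerProcess blocks until the crawl finishes; note that it can
    # only be started once per process, which matters inside a worker.
    process = CrawlerProcess(get_project_settings())
    process.crawl(TCSpider, band=band)
    process.start()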