I am using RabbitMQ with my spiders; each spider sends its data to a receiver.
I start receiver.py as a daemon with this command:
daemon python /receiver.py
When I start multiple spider instances, it seems the "expired" queue needs more receiver.py instances to keep up.
What is wrong with my code?
The sender works like this (a Scrapy spider):
import pika
from scrapy.linkextractors import LxmlLinkExtractor

# module-level connection shared by the spider callback below
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='expired')

def parse_items(self, response):
    for link in LxmlLinkExtractor(allow=(), deny=self.allowed_domains, canonicalize=False).extract_links(response):
        domain = link.url  # assumption: the domain is taken from the extracted link
        # self.logger.info('[{}] Added to the List'.format(domain))
        channel.basic_publish(exchange='', routing_key='expired', body=domain)
        self.domains.append(domain)
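For comparison, here is a minimal sketch of the same sender with the connection owned by the spider itself rather than the module, so each spider process opens and closes its own channel. The class name ExpiredSpider and taking the domain from link.url are assumptions, not the original code:

import pika
import scrapy
from scrapy.linkextractors import LxmlLinkExtractor

class ExpiredSpider(scrapy.Spider):
    name = 'expired'  # hypothetical spider name

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.domains = []
        # one connection and channel per spider process
        self.connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue='expired')

    def parse_items(self, response):
        for link in LxmlLinkExtractor(allow=(), deny=self.allowed_domains, canonicalize=False).extract_links(response):
            domain = link.url  # assumption: domain derived from the link
            self.channel.basic_publish(exchange='', routing_key='expired', body=domain)
            self.domains.append(domain)

    def closed(self, reason):
        # release the broker connection when the spider finishes
        self.connection.close()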
The receiver does this:
import logging
import threading
import time

import pika

class Threaded_worker(threading.Thread):
    def callback(self, ch, method, properties, domain):
        # parse_checkdomain is defined elsewhere in this class
        url = 'http://www.checkdomain.com/cgi-bin/checkdomain.pl?domain=' + domain
        self.parse_checkdomain(url, domain)
        time.sleep(domain.count('.'))
        ch.basic_ack(delivery_tag=method.delivery_tag)

    def __init__(self):
        threading.Thread.__init__(self)
        # each worker thread opens its own connection and channel
        self.connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue='expired')
        self.channel.basic_qos(prefetch_count=1)
        self.channel.basic_consume(self.callback, queue='expired')  # pika < 1.0 signature

    def run(self):
        logging.warning('Worker Start !')
        self.channel.start_consuming()

for _ in range(15):
    td = Threaded_worker()
    td.setDaemon(False)
    td.start()
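Note that the basic_consume call above uses the pre-1.0 pika signature (callback first, queue second). On pika 1.0 and later the arguments changed, so a receiver running against a newer pika fails at startup. A minimal single-threaded sketch of the same consumer under pika >= 1.0, with the domain-checking step elided:

import pika

def callback(ch, method, properties, body):
    domain = body.decode()  # message bodies arrive as bytes
    # ... check the domain here ...
    ch.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='expired')
channel.basic_qos(prefetch_count=1)
channel.basic_consume(queue='expired', on_message_callback=callback)
channel.start_consuming()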
By the way, I have a small side question: if receiver.py is not running, is all the data still kept in the queue?
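On that side question: messages already published stay in the "expired" queue while the RabbitMQ broker is running, even when no receiver.py is attached; they are only lost if the broker itself restarts, because neither the queue nor the messages above are declared durable. A minimal sketch of the durable variant, assuming the queue can be recreated (redeclaring an existing queue with a different durable flag raises a channel error, and sender and receiver must declare it identically):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='expired', durable=True)  # queue survives broker restarts
channel.basic_publish(
    exchange='',
    routing_key='expired',
    body='example.com',  # hypothetical payload
    properties=pika.BasicProperties(delivery_mode=2),  # mark the message persistent
)
connection.close()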