I'm trying to spin up and connect two containers (mongo and a scrapy spider) using docker-compose. As a Docker newcomer, I'm having a hard time troubleshooting network ports (inside and outside the containers). I'll keep this short to respect your time.

The problem:
The spider cannot connect to the mongo db container and fails with a timeout error. I believe the IP address I'm connecting to from the container is incorrect. However, the spider works locally (the non-dockerized version) and can pass data to a running mongo container.

Minor edits to remove name and email from the code.

The error:
pymongo.errors.ServerSelectionTimeoutError: 127.0.0.1:27017: [Errno 111] Connection refused, Timeout: 30s, Topology Description: <TopologyDescription id: 5feb8bdcf912ec8797c25497, topology_type: Single
Pipeline code:
from scrapy.exceptions import DropItem
# scrapy's own log module is deprecated; use the stdlib logging module
import logging
import pymongo


class xkcdMongoDBStorage:
    """Pipeline that stores scraped items in a MongoDB collection."""

    def __init__(self):
        # MongoClient requires two arguments (address and port)
        self.conn = pymongo.MongoClient(
            '127.0.0.1', 27017)  # works with the spider run locally and the mongo container running
        # MongoDB creates databases and collections lazily on first use,
        # so accessing them works whether or not they already exist
        self.db = self.conn['randallMunroe']
        self.collection = self.db['webComic']

    def process_item(self, item, spider):
        # drop items that have any empty field (check values, not keys)
        for field in item:
            if not item[field]:
                raise DropItem(f"Missing {field}!")
        # Collection.insert was removed in pymongo 4; use insert_one
        self.collection.insert_one(dict(item))
        logging.info("Question added to MongoDB database!")
        return item
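One way to make the same pipeline work both locally and inside Docker is to read the MongoDB address from environment variables instead of hard-coding 127.0.0.1. A minimal sketch (`MONGO_HOST` and `MONGO_PORT` are assumed variable names, not part of the original code):

```python
import os

# Hypothetical helper: resolve the MongoDB address from environment
# variables, falling back to the local defaults used in the pipeline.
# MONGO_HOST / MONGO_PORT are assumed names, not from the original code.
def mongo_uri():
    host = os.environ.get("MONGO_HOST", "127.0.0.1")
    port = int(os.environ.get("MONGO_PORT", "27017"))
    return f"mongodb://{host}:{port}"

# Locally this defaults to 127.0.0.1; under docker-compose you could set
# MONGO_HOST on the scraper service so it targets the mongo container.
print(mongo_uri())
```

The pipeline would then call `pymongo.MongoClient(mongo_uri())`, and only the compose file needs to change between environments.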
The spider's Dockerfile:
# base image
FROM python:3
# metadata info
LABEL maintainer="first last name" email="something@gmail.com"
# exposing container port to be the same as scrapy default
EXPOSE 6023
# set the work directory so that paths can be relative
WORKDIR /usr/src/app
# copy to make usage of caching
COPY requirements.txt ./
#install dependencies
RUN pip3 install --no-cache-dir -r requirements.txt
# copy code itself from local file to image
COPY . .
CMD scrapy crawl xkcdDocker
docker-compose.yml:
version: '3'
services:
  db:
    image: mongo:latest
    container_name: NoSQLDB
    restart: always
    environment:
      MONGO_INITDB_ROOT_USERNAME: root
      MONGO_INITDB_ROOT_PASSWORD: password
    volumes:
      - ./data/bin:/data/db
    ports:
      - 27017:27017
    expose:
      - 27017
  xkcd-scraper:
    build: ./scraperDocker
    container_name: xkcd-scraper-container
    volumes:
      - ./scraperDocker:/usr/src/app/scraper
    ports:
      - 5000:6023
    expose:
      - 6023
    depends_on:
      - db
Thanks for your help.
Answer 0 (score: 1):
Try:
self.conn = pymongo.MongoClient('NoSQLDB', 27017)
In docker-compose, you can reference other containers by their service name (here `db`); a `container_name` such as `NoSQLDB` is also resolvable on the compose network. Inside the spider's container, 127.0.0.1 refers to that container itself, not to the mongo container, which is why the connection is refused.
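Since the compose file also sets MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD, the client will likely need those credentials as well once authentication is enforced. A hedged sketch of the full connection string, using the values from the compose file above:

```python
# Sketch: build the connection string from the values in the compose
# file. "db" is the service name; the root credentials come from
# MONGO_INITDB_ROOT_USERNAME / MONGO_INITDB_ROOT_PASSWORD.
uri = "mongodb://root:password@db:27017"

# In the pipeline this would replace the hard-coded 127.0.0.1 call, e.g.:
# self.conn = pymongo.MongoClient(uri)
print(uri)
```

With URI credentials, pymongo authenticates against the admin database by default, which matches the root user that the mongo image creates.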