I created three separate containers using docker-compose: one for Kafka, a second for the Producer (Streamer), and a last one for the Consumer:
version: "3"
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    networks:
      - stream-network
  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - 9092:9092
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
    networks:
      - stream-network
  streamer:
    build:
      context: ./streamingProducer/
    networks:
      - stream-network
    depends_on:
      - kafka
  consumer:
    build:
      context: ./streamingConsumer/
    networks:
      - stream-network
    depends_on:
      - kafka
# top-level network definition (implied by the "stream-network" references above)
networks:
  stream-network:
The producer running in its container generates 10 messages; here is the code:
from confluent_kafka import Producer
import pprint
from faker import Faker
# from bson.json_util import dumps
import time


def delivery_report(err, msg):
    """ Called once for each message produced to indicate delivery result.
    Triggered by poll() or flush(). """
    if err is not None:
        print('Message delivery failed: {}'.format(err))
    else:
        print('Message delivered to {} [{}]'.format(msg.topic(), msg.partition()))


# Generating fake data
myFactory = Faker()
myFactory.random.seed(5467)

for i in range(10):
    data = myFactory.name()
    print("data: ", data)
    # Produce sample message from localhost
    # producer = KafkaProducer(bootstrap_servers=['localhost:9092'], retries=5)
    # Produce message from docker
    producer = Producer({'bootstrap.servers': 'kafka:29092'})
    producer.poll(0)
    # producer.send('live-transactions', dumps(data).encode('utf-8'))
    producer.produce('mytopic', data.encode('utf-8'))
    # block until all async messages are sent
    producer.flush()
    # tidy up the producer connection
    # producer.close()
    time.sleep(0.5)
Here are the 10 output messages:
streamer_1 | producer.py:35: DeprecationWarning: PY_SSIZE_T_CLEAN will be required for '#' formats
streamer_1 | producer.produce('mytopic', data.encode('utf-8'))
streamer_1 | data: Denise Reed
streamer_1 | data: Megan Douglas
streamer_1 | data: Philip Obrien
streamer_1 | data: William Howell
streamer_1 | data: Michael Williamson
streamer_1 | data: Cheryl Jackson
streamer_1 | data: Janet Bruce
streamer_1 | data: Colton Martin
streamer_1 | data: David Melton
streamer_1 | data: Paula Ingram
When I try to consume the messages with the consumer, it only consumes the last message, in this case Paula Ingram, and then the program keeps running forever like an infinite loop. I don't know what's wrong. Here is the consumer code:
from kafka.consumer import KafkaConsumer

try:
    print('Welcome to parse engine')
    # From inside a container
    # consumer = KafkaConsumer('test-topic', bootstrap_servers='kafka:29092')
    # From localhost
    consumer = KafkaConsumer('mytopic', bootstrap_servers='localhost:9092', auto_offset_reset='earliest')
    for message in consumer:
        print('im a message')
        print(message.value.decode("utf-8"))
except Exception as e:
    print(e)
    # Logs the error appropriately.
    pass
Any help would be appreciated. Thanks.
Answer 0 (score: 0)
I suspect you are running into a consumer-group issue.

auto_offset_reset='earliest'

only applies when the consumer group does not yet exist. If the group already exists, consumption resumes from its last committed position.
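That rule can be illustrated without a broker. The function below is a toy model of the broker's decision, not the Kafka API:

```python
def starting_offset(committed, auto_offset_reset, log_end_offset):
    """Toy model: where a consumer group begins reading a partition."""
    if committed is not None:
        # An existing group resumes from its last committed position,
        # regardless of auto_offset_reset.
        return committed
    # Only a brand-new group falls back to the reset policy.
    return 0 if auto_offset_reset == 'earliest' else log_end_offset

# New group: starts at the beginning, so all 10 messages are read.
print(starting_offset(None, 'earliest', 10))   # 0
# Group that already committed offset 9: only the last message onward.
print(starting_offset(9, 'earliest', 10))      # 9
```

This is why a consumer whose group has already committed through message 9 only ever shows the last message, then blocks waiting for new ones.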
If that is not the case, it is unclear in what order you are running the consumer and the producer, but I would start the consumer first, then run

docker-compose restart streamer

a few times.