I am testing Spark Structured Streaming with Kafka. I have a single Kafka broker (0.10.1) on host28, with the default partition count: num.partitions=1
My producer:
bin/kafka-console-producer.sh --broker-list host28:6667 --topic test
When I use
bin/kafka-console-consumer.sh --zookeeper host26:2181,host27:2181,host28:2181 --topic test --from-beginning
or
bin/kafka-console-consumer.sh --bootstrap-server host8:6667 --topic test --from-beginning --partition 0
I can receive messages from Kafka.
But when I use
bin/kafka-console-consumer.sh --bootstrap-server host28:6667 --topic test --from-beginning
or Spark Structured Streaming, no messages are received:
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.DataStreamReader;
import org.apache.spark.sql.streaming.StreamingQuery;

public class Main {
    private static final String APP_NAME = "test_streaming_from_kafka";
    private static final String KAFKA_HOST = "host28:6667";
    private static final String KAFKA_SUBSCRIBE = "test";

    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession
                .builder()
                .appName(APP_NAME)
                .getOrCreate();

        // Kafka source: subscribe to the "test" topic on the broker at host28
        DataStreamReader reader = spark
                .readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", KAFKA_HOST)
                .option("subscribe", KAFKA_SUBSCRIBE);

        // Cast the binary key/value to strings and print each micro-batch to the console
        StreamingQuery query = reader.load()
                .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "topic", "CAST(partition AS STRING)", "CAST(offset AS STRING)")
                .writeStream()
                .format("console")
                .start();

        query.awaitTermination();
    }
}
Answer 0 (score: 0)
Solved!
I changed the Spark log level from INFO to DEBUG, and then I found this:
18/08/17 21:12:07 DEBUG AbstractCoordinator: Received group coordinator response ClientResponse(receivedTimeMs=1534511527794, disconnected=false, request=ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@3d2afb1b, request=RequestSend(header={api_key=10,api_version=0,correlation_id=117,client_id=consumer-1}, body={group_id=spark-kafka-source-f7b2afd9-e1c6-4d16-b299-6d629599cdc8-42875004-driver-0}), createdTimeMs=1534511527794, sendTimeMs=1534511527794), responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}})
18/08/17 21:12:07 DEBUG AbstractCoordinator: Group coordinator lookup for group spark-kafka-source-f7b2afd9-e1c6-4d16-b299-6d629599cdc8-42875004-driver-0 failed: The group coordinator is not available.
I googled "The group coordinator is not available" and found this
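For anyone reading a similar DEBUG log: the error_code=15 in the response body is the numeric form of the "group coordinator is not available" message. Below is a minimal sketch, using only the Java standard library, that pulls the error code out of such a log line and maps it to a name. The class KafkaErrorCodeLookup and its methods are hypothetical helpers written for illustration, not part of the Kafka client, and the table covers only the three coordinator-related codes as they were named in the 0.10.x protocol.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Kafka or Spark.
public class KafkaErrorCodeLookup {
    // Matches "error_code=15" as well as "error_code = 15".
    private static final Pattern ERROR_CODE =
            Pattern.compile("error_code\\s*=\\s*(-?\\d+)");

    // Returns the first error_code found in the log line, if any.
    public static OptionalInt extractErrorCode(String logLine) {
        Matcher m = ERROR_CODE.matcher(logLine);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    // Names for a small subset of Kafka protocol error codes (0.10.x naming).
    public static String describe(int code) {
        switch (code) {
            case 0:  return "NONE";
            case 14: return "GROUP_LOAD_IN_PROGRESS";
            case 15: return "GROUP_COORDINATOR_NOT_AVAILABLE";
            case 16: return "NOT_COORDINATOR_FOR_GROUP";
            default: return "UNKNOWN(" + code + ")";
        }
    }

    public static void main(String[] args) {
        // Tail of the response body from the DEBUG log above.
        String line = "responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}}";
        OptionalInt code = extractErrorCode(line);
        if (code.isPresent()) {
            System.out.println("error_code=" + code.getAsInt()
                    + " -> " + describe(code.getAsInt()));
        } else {
            System.out.println("no error_code found");
        }
    }
}
```

Run against the log line above, this reports code 15, which matches the "group coordinator is not available" text in the second log entry. It only helps identify the error; it does not say why the coordinator is unavailable on a given cluster.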