In a Spring Boot application I use Kafka and Spark: Spark reads a stream from Kafka, transforms the data, and finally sends the result back to Kafka:
StreamingQuery kafka = scoring
        .writeStream()
        .format("kafka")
        .outputMode(OutputMode.Complete())
        .option("kafka.bootstrap.servers", bootstrapServers)
        .option("topic", outputTopic)
        .option("checkpointLocation", "~/Desktop/checkpoint")
        .queryName("urlCounterKafkaStream")
        .start();
The data that Spark sends has 2 fields (name, count).
In the Kafka listener application I implemented the following simple deserializer:
public class RSSItemDeserializer extends JsonDeserializer<RSSItemDTO> {
    public RSSItemDeserializer() {
        super(RSSItemDTO.class);
    }
}
and registered it in application.properties:
spring.kafka.consumer.value-deserializer=com.noname.deserializer.RSSItemDeserializer
But I get a serialization exception:
org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition urlCounterStream-0 at offset 0. If needed, please seek past the record to continue consumption.
Caused by: org.apache.kafka.common.errors.SerializationException: Can't deserialize data [[104, 116, 116, 112, 115, 58, 47, 47, 119, 119, 119, 46, 48, 53, 53, 50, 46, 117, 97]] from topic [urlCounterStream]
Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'https': was expecting ('true', 'false' or 'null')
at [Source: (byte[])"https://www.0552.ua"; line: 1, column: 7]
Am I missing something? How can I fix this and deserialize the data?
Thanks!
Answer 0 (score: 0)
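My problem was that I assumed Spark sends the data as JSON by default. It does not: the record value arrived as a plain string (https://www.0552.ua in the exception above), which Jackson cannot parse as JSON. The solution in this case is to call toJSON() on the result before writing it to Kafka: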
StreamingQuery kafka = scoring.toJSON()
        .writeStream()
        ...
Maybe this will help someone.
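For anyone who wants to confirm the diagnosis, the byte array printed in the SerializationException can be decoded by hand. The following is a minimal sketch using only the JDK; the class name PayloadInspector and the looksLikeJsonObject check are hypothetical helpers made up for illustration, not part of Kafka, Spring, or Spark, and the check is only a rough heuristic rather than a real JSON validation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for inspecting the payload shown in the exception message.
class PayloadInspector {

    // Bytes copied verbatim from the "Can't deserialize data [[...]]" line.
    static final byte[] PAYLOAD = {
        104, 116, 116, 112, 115, 58, 47, 47, 119, 119,
        119, 46, 48, 53, 53, 50, 46, 117, 97
    };

    // Decode the raw record value as UTF-8 text.
    static String decode(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    // Rough heuristic only: a JSON object has to start with '{'.
    static boolean looksLikeJsonObject(String text) {
        return text.trim().startsWith("{");
    }

    public static void main(String[] args) {
        String text = decode(PAYLOAD);
        System.out.println(text);                      // prints https://www.0552.ua
        System.out.println(looksLikeJsonObject(text)); // prints false
    }
}
```

The decoded value is just the URL, with no surrounding JSON structure and no count field, which matches the Jackson error about the unrecognized token 'https'. After toJSON() the record value should instead be a JSON object that a JsonDeserializer can map onto the DTO.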