I am stuck on this problem and I cannot figure out what is going on. I am trying to use Kafka Streams to write a log to a topic. On the other end, I have Kafka Connect inserting each entry into MySQL. So, basically, what I need is a Kafka Streams program that reads a topic as strings, parses them into Avro format, and then feeds them into another topic.
Here is the code I wrote:
//Define schema
String userSchema = "{"
+ "\"type\":\"record\","
+ "\"name\":\"myrecord\","
+ "\"fields\":["
+ " { \"name\":\"ID\", \"type\":\"int\" },"
+ " { \"name\":\"COL_NAME_1\", \"type\":\"string\" },"
+ " { \"name\":\"COL_NAME_2\", \"type\":\"string\" }"
+ "]}";
String key = "key1";
Schema.Parser parser = new Schema.Parser();
Schema schema = parser.parse(userSchema);
System.out.println("Kafka Streams Demonstration");
//Settings
Properties settings = new Properties();
// Set a few key parameters
settings.put(StreamsConfig.APPLICATION_ID_CONFIG, APP_ID);
// Kafka bootstrap server (broker to talk to); port 9092 is where the (single) broker listens
settings.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Apache ZooKeeper instance keeping watch over the Kafka cluster; port 2181 is where ZooKeeper listens
settings.put(StreamsConfig.ZOOKEEPER_CONNECT_CONFIG, "localhost:2181");
// default serdes for serializing and deserializing keys and values from and to streams in case no specific Serde is specified
settings.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
settings.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
settings.put(StreamsConfig.STATE_DIR_CONFIG ,"/tmp");
// to work around exception Exception in thread "StreamThread-1" java.lang.IllegalArgumentException: Invalid timestamp -1
// at org.apache.kafka.clients.producer.ProducerRecord.<init>(ProducerRecord.java:60)
// see: https://groups.google.com/forum/#!topic/confluent-platform/5oT0GRztPBo
// Create an instance of StreamsConfig from the Properties instance
StreamsConfig config = new StreamsConfig(settings);
final Serde<String> stringSerde = Serdes.String();
final Serde<Long> longSerde = Serdes.Long();
final Serde<byte[]> byteArraySerde = Serdes.ByteArray();
// building Kafka Streams Model
KStreamBuilder kStreamBuilder = new KStreamBuilder();
// the source of the streaming analysis is the topic with country messages
KStream<byte[], String> instream =
kStreamBuilder.stream(byteArraySerde, stringSerde, "sqlin");
final KStream<byte[], GenericRecord> outstream = instream.mapValues(new ValueMapper<String, GenericRecord>() {
    @Override
    public GenericRecord apply(final String record) {
        System.out.println(record);
        GenericRecord avroRecord = new GenericData.Record(schema);
        String[] array = record.split(" ", -1);
        for (int i = 0; i < array.length; i = i + 1) {
            if (i == 0)
                avroRecord.put("ID", Integer.parseInt(array[0]));
            if (i == 1)
                avroRecord.put("COL_NAME_1", array[1]);
            if (i == 2)
                avroRecord.put("COL_NAME_2", array[2]);
        }
        System.out.println(avroRecord);
        return avroRecord;
    }
});
outstream.to("sqlout");
Here is the output after I hit the null pointer exception:
java -cp streams-examples-3.2.1-standalone.jar io.confluent.examples.streams.sql
Kafka Streams Demonstration
Start
Now started CountriesStreams Example
5 this is
{"ID": 5, "COL_NAME_1": "this", "COL_NAME_2": "is"}
Exception in thread "StreamThread-1" java.lang.NullPointerException
    at org.apache.kafka.streams.processor.internals.SinkNode.process(SinkNode.java:81)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:83)
    at org.apache.kafka.streams.kstream.internals.KStreamMapValues$KStreamMapProcessor.process(KStreamMapValues.java:42)
    at org.apache.kafka.streams.processor.internals.ProcessorNode$1.run(ProcessorNode.java:48)
    at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:188)
    at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:134)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:83)
    at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:70)
    at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:197)
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:627)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:361)
The topic sqlin contains messages that consist of a number followed by two words. Note the two print lines: the function receives a message and parses it successfully before hitting the null pointer. The problem is that I am new to Java, Kafka, and Avro, so I am not sure where I went wrong. Did I set up the Avro schema correctly? Or am I using KStream incorrectly? Any help is greatly appreciated.
Answer 0 (score: 1):
I think the problem is in the following line:

    outstream.to("sqlout");

By default, your application is configured to use a String serde for record keys and record values:

    settings.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    settings.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

Since outstream is of type KStream<byte[], GenericRecord>, you must provide explicit serdes when calling to().
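A minimal sketch of what that call could look like, assuming genericAvroSerde is a Serde<GenericRecord> that you construct and configure yourself (the variable name is illustrative, not part of the original code):

    // Hypothetical: genericAvroSerde stands in for a Serde<GenericRecord>
    // you have built yourself; Kafka Streams 0.10.x does not ship one.
    // byteArraySerde is the key serde already defined above.
    outstream.to(byteArraySerde, genericAvroSerde, "sqlout");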
FYI: the next version of Confluent Platform (ETA: this month = June 2017) will ship with a ready-to-use generic + specific Avro serde that integrates with the Confluent schema registry. This should make your life easier.
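For illustration only, wiring up such a registry-backed serde might look roughly like the sketch below. The GenericAvroSerde class, the "schema.registry.url" property, and the URL itself are assumptions based on the Confluent serdes described above; check them against the release you actually install.

    // Assumed imports: io.confluent.kafka.streams.serdes.avro.GenericAvroSerde,
    // java.util.Collections (class and property names are assumptions).
    final Serde<GenericRecord> genericAvroSerde = new GenericAvroSerde();
    final boolean isKeySerde = false; // configure it as a value serde
    genericAvroSerde.configure(
            Collections.singletonMap("schema.registry.url", "http://localhost:8081"),
            isKeySerde);
    outstream.to(byteArraySerde, genericAvroSerde, "sqlout");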
For further details, see the answer at https://stackoverflow.com/a/44433098/1743580.