Joining Avro-formatted data using a lambda in KStream

Time: 2018-05-28 08:16:43

Tags: java apache-kafka apache-kafka-streams confluent-schema-registry

I have two streams:

Stream 1:
[KSTREAM-MAP-0000000004]: 1, {"id": 1, "name": "john", "age": 26}
[KSTREAM-MAP-0000000004]: 2, {"id": 2, "name": "jane", "age": 24}
[KSTREAM-MAP-0000000004]: 3, {"id": 3, "name": "julia", "age": 25}
[KSTREAM-MAP-0000000004]: 4, {"id": 4, "name": "jamie", "age": 22}
[KSTREAM-MAP-0000000004]: 5, {"id": 5, "name": "jenny", "age": 27}

Stream 2:
[KSTREAM-MAP-0000000004]: 1, {"id": 1, "name": "xxx", "age": 26}
[KSTREAM-MAP-0000000004]: 2, {"id": 2, "name": "yyy", "age": 24}
[KSTREAM-MAP-0000000004]: 31, {"id": 3, "name": "zzz", "age": 25}
[KSTREAM-MAP-0000000004]: 41, {"id": 4, "name": "uuu", "age": 22}
[KSTREAM-MAP-0000000004]: 51, {"id": 5, "name": "iii", "age": 27}

Now I want to join the two streams and, based on the key, retrieve the stream 1 records that do not exist in stream 2.

My expected output is as follows:

3, {"id": 3, "name": "julia", "age": 25}
4, {"id": 4, "name": "jamie", "age": 22}
5, {"id": 5, "name": "jenny", "age": 27}
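To make the intended result concrete, here is a minimal plain-Java sketch of the key-based "left only" selection described above. It does not use Kafka Streams at all: the two streams are modelled as ordered maps, and the names `AntiJoinSketch`, `leftOnly`, `rec`, `stream1` and `stream2` are hypothetical helpers introduced only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AntiJoinSketch {

    /** Returns the entries of left whose key does not appear in right, in left's order. */
    static <K, V> Map<K, V> leftOnly(Map<K, V> left, Map<K, ?> right) {
        Map<K, V> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : left.entrySet()) {
            if (!right.containsKey(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    /** Formats a record the same way the sample data in the question is printed. */
    static String rec(int id, String name, int age) {
        return String.format("{\"id\": %d, \"name\": \"%s\", \"age\": %d}", id, name, age);
    }

    /** Sample data of stream 1, keyed by the record key shown in the question. */
    static Map<Integer, String> stream1() {
        Map<Integer, String> m = new LinkedHashMap<>();
        m.put(1, rec(1, "john", 26));
        m.put(2, rec(2, "jane", 24));
        m.put(3, rec(3, "julia", 25));
        m.put(4, rec(4, "jamie", 22));
        m.put(5, rec(5, "jenny", 27));
        return m;
    }

    /** Sample data of stream 2; note the keys 31, 41 and 51. */
    static Map<Integer, String> stream2() {
        Map<Integer, String> m = new LinkedHashMap<>();
        m.put(1, rec(1, "xxx", 26));
        m.put(2, rec(2, "yyy", 24));
        m.put(31, rec(3, "zzz", 25));
        m.put(41, rec(4, "uuu", 22));
        m.put(51, rec(5, "iii", 27));
        return m;
    }

    public static void main(String[] args) {
        // Print every stream 1 record whose key has no counterpart in stream 2.
        leftOnly(stream1(), stream2()).forEach((k, v) -> System.out.println(k + ", " + v));
    }
}
```

Running `main` prints the records with keys 3, 4 and 5, because stream 2 carries those ids under the keys 31, 41 and 51. A real stream-stream join is windowed and emits results incrementally, so this sketch only pins down the desired end result, not how Kafka Streams would compute it.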

My schema registry file:

{"namespace": "schema.avro",
 "type": "record",
 "name": "mysql",
 "fields": [
     {"name": "id", "type": "int", "doc" : "id"},
     {"name": "name", "type": "string", "doc" : "name"},
     {"name": "age", "type": "int", "doc" : "age"}
 ]
}

I tried to join them this way:

final Serde<GenericRecord> genericAvroSerde = new GenericAvroSerde();

KStream<Integer,String> joined1 = psql_data.leftJoin(mysql_data,
    (leftValue, rightValue) ->  "psql_data=" + leftValue + ", mysql_data=" + rightValue,
    JoinWindows.of(TimeUnit.MINUTES.toMillis(1)),
    Joined.with(
      Serdes.Integer(),
      genericAvroSerde,
      genericAvroSerde)
);

But I get a compilation error:

[ERROR] /home/kafka-connect/confluent-4.1.0/kafka_streaming/src/main/java/com/aail/kafka_stream.java:[140,43] error: no suitable method found for leftJoin(KStream<Integer,mysql>,(leftValue[...]Value,JoinWindows,Joined<Integer,GenericRecord,GenericRecord>)
[ERROR] method KStream.<VO#1,VR#1>leftJoin(KStream<Integer,VO#1>,ValueJoiner<? super mysql,? super VO#1,? extends VR#1>,JoinWindows) is not applicable
[ERROR] (cannot infer type-variable(s) VO#1,VR#1
[ERROR] (actual and formal argument lists differ in length))
[ERROR] method KStream.<VO#2,VR#2>leftJoin(KStream<Integer,VO#2>,ValueJoiner<? super mysql,? super VO#2,? extends VR#2>,JoinWindows,Joined<Integer,mysql,VO#2>) is not applicable
[ERROR] (inferred type does not conform to equality constraint(s)
[ERROR] inferred: GenericRecord
[ERROR] equality constraints(s): GenericRecord,mysql)

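I think I need to pass my mysql Avro class for the left and right values in the join function instead of genericAvroSerde. I tried that, but could not get it to work. Can someone help me perform the join?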

1 answer:

Answer 0 (score: 1):

You need to configure the GenericAvroSerde before using it:

final Serde<GenericRecord> genericAvroSerde = new GenericAvroSerde();
genericAvroSerde.configure(...);

Pass in a configuration that lets it find the Confluent Schema Registry, as described in the documentation: https://docs.confluent.io/current/streams/developer-guide/datatypes.html#avro
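As a rough sketch of that configuration step, assuming the Confluent `kafka-streams-avro-serde` dependency is on the classpath: the serde receives a map containing the Schema Registry URL, plus a flag saying whether it is used for keys or values. The URL `http://localhost:8081` is a placeholder assumption, and `AvroSerdeConfigSketch`, `serdeConfig` and `valueSerde` are hypothetical names used only here. Note also that the compiler output quoted in the question points at a mismatch between the stream's value type (`mysql`) and `GenericRecord`, which configuring the serde alone may not resolve.

```java
import io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig;
import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.common.serialization.Serde;

import java.util.Collections;
import java.util.Map;

class AvroSerdeConfigSketch {

    // Assumption: placeholder Schema Registry address; replace it with your own.
    static final String SCHEMA_REGISTRY_URL = "http://localhost:8081";

    /** Builds the minimal serde configuration: just the Schema Registry URL. */
    static Map<String, String> serdeConfig(String schemaRegistryUrl) {
        return Collections.singletonMap(
                AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, schemaRegistryUrl);
    }

    /** Creates a GenericAvroSerde configured for record values. */
    static Serde<GenericRecord> valueSerde(String schemaRegistryUrl) {
        Serde<GenericRecord> serde = new GenericAvroSerde();
        // Second argument false: the serde is used for values, not for keys.
        serde.configure(serdeConfig(schemaRegistryUrl), false);
        return serde;
    }
}
```

The serde returned by `AvroSerdeConfigSketch.valueSerde(AvroSerdeConfigSketch.SCHEMA_REGISTRY_URL)` could then be passed where the question uses `genericAvroSerde`, provided the streams being joined actually carry `GenericRecord` values.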