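I have a data stream that reads from a Kafka topic. The pipeline contains a keyBy, and I measured the time elapsed between the two functions on either side of it. From that measurement I conclude that the keyBy step takes a lot of time. My requirement is that each record read from the Kafka topic is processed within ten milliseconds. The code is as follows.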
    // Tuples read from the Kafka topic.
    DataStream<Tuple2<Long, String>> transactionTupleDS =
            getOnlineTransactionsKafka(streamEnv, properties);

    // Stamp each record with the current wall-clock time, then key the stream by group ID.
    DataStream<CardTransaction> cardTransactionsDS = transactionTupleDS
            .map(new MapFunction<Tuple2<Long, String>, CardTransaction>() {
                @Override
                public CardTransaction map(Tuple2<Long, String> longStringTuple2) throws Exception {
                    return new CardTransaction(
                            System.currentTimeMillis(),
                            longStringTuple2.f1,
                            ctgmap.get(longStringTuple2.f1));
                }
            })
            .keyBy(new KeySelector<CardTransaction, Integer>() {
                @Override
                public Integer getKey(CardTransaction cardTransaction) throws Exception {
                    return cardTransaction.groupID;
                }
            });

    // Connect the keyed transactions with the group-ID stream and measure the latency downstream.
    DataStream<ExtractedTransaction> extractedTransactionDS = cardGroupIDsDS
            .connect(cardTransactionsDS)
            .flatMap(new RichCoFlatMapFunction<Tuple2<Integer, Boolean>, CardTransaction, ExtractedTransaction>() {
                @Override
                public void flatMap1(Tuple2<Integer, Boolean> integerBooleanTuple2,
                                     Collector<ExtractedTransaction> collector) throws Exception {
                }

                @Override
                public void flatMap2(CardTransaction cardTransaction,
                                     Collector<ExtractedTransaction> collector) throws Exception {
                    // Milliseconds elapsed since the record was stamped in the map function above.
                    System.out.println(System.currentTimeMillis() - cardTransaction.ingestionTime);
                }
            });
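
For reference, the per-record printout in flatMap2 could be replaced by a small aggregator, so that the ten-millisecond budget is checked against percentiles rather than individual lines. The following is only a minimal sketch in plain Java: the class `LatencyStats` and its methods are hypothetical names of my own, not part of Flink, and it assumes one instance per parallel function instance (it is not thread-safe).

```java
import java.util.Arrays;

// Hypothetical helper (not a Flink class): collects per-record latencies in
// milliseconds and summarizes them against a fixed latency budget.
class LatencyStats {
    private final long budgetMillis;
    private long[] samples = new long[1024];
    private int count = 0;

    LatencyStats(long budgetMillis) {
        this.budgetMillis = budgetMillis;
    }

    // Record one sample, e.g. System.currentTimeMillis() - cardTransaction.ingestionTime.
    void record(long latencyMillis) {
        if (count == samples.length) {
            samples = Arrays.copyOf(samples, samples.length * 2);
        }
        samples[count++] = latencyMillis;
    }

    int count() {
        return count;
    }

    // Largest latency seen so far.
    long max() {
        requireSamples();
        long m = samples[0];
        for (int i = 1; i < count; i++) {
            m = Math.max(m, samples[i]);
        }
        return m;
    }

    // Number of samples strictly above the budget.
    int overBudget() {
        int n = 0;
        for (int i = 0; i < count; i++) {
            if (samples[i] > budgetMillis) {
                n++;
            }
        }
        return n;
    }

    // Nearest-rank percentile, for p in (0, 100].
    long percentile(double p) {
        requireSamples();
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * count);
        return sorted[Math.max(rank, 1) - 1];
    }

    private void requireSamples() {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
    }
}
```

In flatMap2 this would be used as `stats.record(System.currentTimeMillis() - cardTransaction.ingestionTime)`, with `stats` created as `new LatencyStats(10)`. Note that both timestamps come from `System.currentTimeMillis()`, so the difference is only meaningful if the two operators run on machines with synchronized clocks.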
Could you take a moment to help me improve my code?