Applying multiple filters in a loop on a Kafka stream and writing to multiple topics

Time: 2018-08-21 17:29:50

Tags: apache-kafka apache-kafka-streams kafka-producer-api

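I have a requirement made up of a list of filters (of the form schema_field = 'val'), each with a corresponding topic. I need to iterate over that list, apply each filter, and write the filtered record values to that filter's topic using KStreams. Is there a feature for this?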

Example:

synchronized (subscriberFilterRequirements) {
    // Each iteration registers one filter -> map -> to() chain on the shared source stream.
    for (SubscriberFilterRequirements req : subscriberFilterRequirements) {
        log.info("*** Applying transformations on record");
        KStream<String, GenericRecord> subscriberFilteredRecord = filteredRecord;
        String filterSql = req.getPipelineSubscriptions().getFiltersql();
        if (filterSql != null && !filterSql.isEmpty()) {
            // Parse the "field=value" filter once instead of on every record.
            String[] filter = filterSql.trim().split("=");
            String field = filter[0].trim();
            String expected = filter[1].trim();
            subscriberFilteredRecord = filteredRecord.filter((key, value) -> {
                Object actual = value.get(field);
                return actual != null && actual.toString().equalsIgnoreCase(expected);
            });
        }
        Schema schema = Utils.getAvroSchema(req.getPipelineSubscriptions().getSubscriberSchemaLocation(),
                    req.getPipelineSubscriptions().getSubscriberSchemaLocationType());
        subscriberFilteredRecord.map((key, value) -> {
            // Build a fresh record and field iterator per message; sharing them across
            // messages would reuse one mutable record and an already-consumed iterator.
            GenericRecord sinkRecord = new GenericData.Record(schema);
            fillAvroRecord(sinkRecord, schema.getFields().iterator(), value);
            return new KeyValue<>(key, sinkRecord);
        }).to(req.getPipelineSubscriptions().getKafkaTopic());
    }
}

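What currently happens is that the loop's context and the KStream's context are not the same. When streaming starts, the loop runs fine the first time, i.e. the KStream receives the first filter; from then on, the KStream behaves like an infinite loop and never picks up the second filter. I want to inject the remaining filters so that they are applied to the records one after another.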

2 answers:

Answer 0: (score: 1)

Assuming you have three filter predicates p1, p2, and p3, you can do this:

KStream stream = ...
stream.filter(p1).to("output-1");
stream.filter(p2).to("output-2");
stream.filter(p3).to("output-3");

// or as a loop
Predicate[] predicate = new Predicate[]{p1,p2,p3};
String[] outputTopic = new String[]{"output-1","output-2","output-3"};
for(int i = 0; i < 3; ++i) {
    stream.filter(predicate[i]).to(outputTopic[i]);
}

If you have a collection of predicate/output-topic pairs, this should also work with foreach() and a lambda expression.
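The pair-collection idea can be sketched in plain Java. This is only a model, not Kafka Streams code: it uses java.util.function.Predicate and a Map standing in for the record value so that it runs without any Kafka dependency. The names FilterRoutingSketch, Route, parseFilter, and topicsFor are hypothetical, and the "field = 'val'" filter syntax is assumed from the question. In a real topology, the body of the forEach would be stream.filter(...).to(topic), as in the loop above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class FilterRoutingSketch {

    /** Hypothetical predicate/output-topic pair. */
    static final class Route {
        final Predicate<Map<String, Object>> predicate;
        final String topic;

        Route(Predicate<Map<String, Object>> predicate, String topic) {
            this.predicate = predicate;
            this.topic = topic;
        }
    }

    /** Parses a simple "field = 'val'" filter (assumed syntax) into a predicate. */
    static Predicate<Map<String, Object>> parseFilter(String filterSql) {
        String[] parts = filterSql.trim().split("=", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected field=value, got: " + filterSql);
        }
        String field = parts[0].trim();
        // Strip one optional pair of surrounding single quotes from the value.
        String expected = parts[1].trim().replaceAll("^'|'$", "");
        return value -> {
            Object actual = value.get(field);
            return actual != null && actual.toString().equalsIgnoreCase(expected);
        };
    }

    /** Returns every topic whose predicate matches the value; each route is checked independently. */
    static List<String> topicsFor(List<Route> routes, Map<String, Object> value) {
        List<String> topics = new ArrayList<>();
        routes.forEach(route -> {
            if (route.predicate.test(value)) {
                topics.add(route.topic);
            }
        });
        return topics;
    }

    public static void main(String[] args) {
        List<Route> routes = List.of(
                new Route(parseFilter("region = 'EU'"), "output-1"),
                new Route(parseFilter("tier='gold'"), "output-2"));

        Map<String, Object> matchesBoth = Map.of("region", "eu", "tier", "gold");
        Map<String, Object> matchesNone = Map.of("region", "US", "tier", "silver");

        System.out.println(topicsFor(routes, matchesBoth)); // prints [output-1, output-2]
        System.out.println(topicsFor(routes, matchesNone)); // prints []
    }
}
```

The point of the model is that the list of routes is assembled once up front, and every record is then tested against all of them, so a record can land in zero, one, or several topics.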

Answer 1: (score: 0)

I think you need to use the branch method on KStream with multiple predicates (filters), as shown below:

KStream