I want to implement a simple pipeline that counts words in fixed (tumbling) 5-second windows:
    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("a b c", "d e f", "c b a", "f e d", "l h f g a", "abc cba");
        BaseRichSpout spout = new SproutSentences(sentences);

        TridentTopology topology = new TridentTopology();
        topology.newStream("sentences", spout)
                // Log every sentence as it enters the stream.
                .peek(new Consumer() {
                    @Override
                    public void accept(TridentTuple input) {
                        System.out.println(String.format("Sentence: %s", input.getStringByField("sentence")));
                    }
                })
                // Split each sentence into one tuple per word.
                .flatMap(new FlatMapFunction() {
                    @Override
                    public Iterable<Values> execute(TridentTuple input) {
                        return Arrays.stream(input.getStringByField("sentence").split(" "))
                                .map(Values::new)
                                .collect(Collectors.toList());
                    }
                }, new Fields("word"))
                .peek(new Consumer() {
                    @Override
                    public void accept(TridentTuple input) {
                        System.out.println(String.format("Word: %s", input.getStringByField("word")));
                    }
                })
                // Aggregate the "word" tuples over 5-second tumbling windows.
                .tumblingWindow(BaseWindowedBolt.Duration.seconds(5), new InMemoryWindowsStoreFactory(),
                        new Fields("word"), new CountAsAggregator(), new Fields("count"))
                // This is the output that never shows up.
                .peek(new Consumer() {
                    @Override
                    public void accept(TridentTuple input) {
                        System.out.println(String.format("Windowed: %s", input));
                    }
                });

        Config config = new Config();
        config.setDebug(false);

        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("Test", config, topology.build());
    }

    // Spout that emits one sentence per second, cycling through the given list.
    static class SproutSentences extends BaseRichSpout {
        private SpoutOutputCollector outputCollector;
        private List<String> sentences;
        private int i = 0;

        public SproutSentences(List<String> sentences) {
            this.sentences = sentences;
        }

        @Override
        public void open(Map map, TopologyContext topologyContext, SpoutOutputCollector spoutOutputCollector) {
            this.outputCollector = spoutOutputCollector;
        }

        @Override
        public void nextTuple() {
            Utils.sleep(1000);
            // The second argument is the message id of the emitted tuple.
            this.outputCollector.emit(new Values(sentences.get(i++ % sentences.size())), i);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
            outputFieldsDeclarer.declare(new Fields("sentence"));
        }
    }
}
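The problem is that nothing is emitted after the window ends. I tried changing the window duration, and also replacing the time-based window with a count-based one, but neither helped.
If I do the same thing with the plain (vanilla) Storm API, everything works as expected. How can I fix this?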
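
To make the expected behaviour concrete, here is a minimal plain-Java sketch (no Storm involved) of what I would expect a 5-second tumbling window to produce for this input. It rests on some assumptions: the class `TumblingWindowSketch` and the method `countPerWindow` are hypothetical names used only for illustration, sentence number k is assumed to arrive exactly at k * 1000 ms, and windows are assumed to start at time 0. As far as I understand, `CountAsAggregator` emits one total tuple count per window rather than a count per word, so the sketch does the same.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TumblingWindowSketch {

    // Returns windowStartMs -> number of word tuples that fall into that window.
    // Assumption: sentence k arrives at k * intervalMs and windows are aligned to time 0.
    public static Map<Long, Long> countPerWindow(List<String> sentences, long intervalMs, long windowMs) {
        Map<Long, Long> counts = new TreeMap<>();
        for (int k = 0; k < sentences.size(); k++) {
            long arrivalMs = k * intervalMs;
            // Tumbling windows do not overlap, so each tuple belongs to exactly one window.
            long windowStartMs = (arrivalMs / windowMs) * windowMs;
            long words = sentences.get(k).split(" ").length;
            counts.merge(windowStartMs, words, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("a b c", "d e f", "c b a", "f e d", "l h f g a", "abc cba");
        countPerWindow(sentences, 1000L, 5000L).forEach((start, count) ->
                System.out.println(String.format("Window starting at %d ms: %d words", start, count)));
    }
}
```

Under these assumptions the first window (0 to 5 s) holds the first five sentences, i.e. 17 words, and the second window holds the remaining 2. In the real topology the exact boundaries depend on processing time, but I would still expect a "Windowed: ..." line roughly every 5 seconds.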