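Acked is zero with the word count topology

Time: 2016-01-20 15:17:21

Tags: java apache-storm

I am new to Storm. I submitted the word count topology from the storm-starter project.

I get the word count output.

But Acked is zero! How can I fix this?

For anyone who does not know the project, here is the link to the code: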

https://github.com/nathanmarz/storm-starter/blob/master/src/jvm/storm/starter/WordCountTopology.java

package storm.starter;

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.task.ShellBolt;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.IRichBolt;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;
import storm.starter.spout.RandomSentenceSpout;

import java.util.HashMap;
import java.util.Map;

/**
 * This topology demonstrates Storm's stream groupings and multilang capabilities.
 */
public class WordCountTopology {
  public static class SplitSentence extends ShellBolt implements IRichBolt {

    public SplitSentence() {
      super("python", "splitsentence.py");
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
      declarer.declare(new Fields("word"));
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
      return null;
    }
  }

  public static class WordCount extends BaseBasicBolt {
    Map<String, Integer> counts = new HashMap<String, Integer>();

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
      String word = tuple.getString(0);
      Integer count = counts.get(word);
      if (count == null)
        count = 0;
      count++;
      counts.put(word, count);
      collector.emit(new Values(word, count));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
      declarer.declare(new Fields("word", "count"));
    }
  }

  public static void main(String[] args) throws Exception {

    TopologyBuilder builder = new TopologyBuilder();

    builder.setSpout("spout", new RandomSentenceSpout(), 5);

    builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
    builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

    Config conf = new Config();
    conf.setDebug(true);


    if (args != null && args.length > 0) {
      conf.setNumWorkers(3);

      StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
    }
    else {
      conf.setMaxTaskParallelism(3);

      LocalCluster cluster = new LocalCluster();
      cluster.submitTopology("word-count", conf, builder.createTopology());

      Thread.sleep(10000);

      cluster.shutdown();
    }
  }
}

1 answer:

Answer 0 (score: 1)

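To get tuples acked in Storm, a few things need to be in place.

  1. Make sure ackers are enabled by setting the TOPOLOGY_ACKER_EXECUTORS config.

    // in Java
    conf.put(Config.TOPOLOGY_ACKER_EXECUTORS, 2);
    # in storm.yaml
    topology.acker.executors: 2  # 0 disables acking; check defaults.yaml for your version's default
    

    You only need to set this in one of the two places. storm.yaml supplies the default, and the Java config overrides anything set in storm.yaml.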

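  2. Tuple anchoring in your bolts.

    // short Java snippet
    String sentence = tuple.getString(0);
    for (String word : sentence.split(" ")) {
      _collector.emit(tuple, new Values(word)); // anchored to the input tuple
      _collector.emit(new Values(word));        // not anchored
    }
    _collector.ack(tuple);
    

    As described in the link above, you must anchor tuples to one another for acking to work. The first collector.emit anchors the newly created tuple, new Values(word), to the input tuple. The second collector.emit does not anchor the tuple. I don't know how to do this in Python, so you will have to figure that part out.

  3. You may need to do other things as well. This answer is mostly from memory, and I have not tested any of your code, but it should give you a starting point. If you have further questions, please read the documentation before asking another low-quality question. That is how I figured it out, and you should learn to develop that skill too.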
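
To make the anchoring point more concrete, here is a minimal sketch of the bookkeeping idea behind ackers. It is a toy model in plain Java and not Storm's actual implementation. The class `AckTreeSketch`, its `Tracker`, and the tuple ids are all hypothetical names made up for illustration. It assumes the scheme described in Storm's reliability documentation, where the ids of tuples in a tree are XORed into one value per spout tuple and the tree counts as fully processed once that value reaches zero.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of acker bookkeeping. Hypothetical code, not part of Storm. */
public class AckTreeSketch {

    /** Keeps one XOR checksum per tracked spout tuple (the "root"). */
    public static class Tracker {
        private final Map<Long, Long> pending = new HashMap<>();

        /** The spout emits a tracked tuple: start a tree with its id. */
        public void emitRoot(long rootId, long tupleId) {
            pending.put(rootId, tupleId);
        }

        /** A bolt emits an anchored tuple: its id joins the tree. */
        public void emitAnchored(long rootId, long newTupleId) {
            pending.computeIfPresent(rootId, (k, v) -> v ^ newTupleId);
        }

        /** A bolt emits without an anchor: the tracker never hears of it. */
        public void emitUnanchored(long newTupleId) {
            // Intentionally empty: unanchored tuples are not tracked.
        }

        /** A bolt acks a tuple: XORing the id again cancels it out. */
        public void ack(long rootId, long tupleId) {
            pending.computeIfPresent(rootId, (k, v) -> v ^ tupleId);
        }

        /** The tree is done when every tracked id has been cancelled. */
        public boolean isFullyProcessed(long rootId) {
            Long v = pending.get(rootId);
            return v != null && v == 0L;
        }
    }

    public static void main(String[] args) {
        // Anchored: the sentence tuple stays pending until both words are acked.
        Tracker anchored = new Tracker();
        anchored.emitRoot(1L, 0x10L);       // sentence tuple
        anchored.emitAnchored(1L, 0x21L);   // first word
        anchored.emitAnchored(1L, 0x22L);   // second word
        anchored.ack(1L, 0x10L);            // split bolt acks the sentence
        System.out.println(anchored.isFullyProcessed(1L)); // false
        anchored.ack(1L, 0x21L);            // count bolt acks first word
        anchored.ack(1L, 0x22L);            // count bolt acks second word
        System.out.println(anchored.isFullyProcessed(1L)); // true

        // Unanchored: the tree completes as soon as the sentence is acked,
        // so nothing downstream of the split bolt is tracked.
        Tracker unanchored = new Tracker();
        unanchored.emitRoot(2L, 0x10L);
        unanchored.emitUnanchored(0x21L);
        unanchored.emitUnanchored(0x22L);
        unanchored.ack(2L, 0x10L);
        System.out.println(unanchored.isFullyProcessed(2L)); // true
    }
}
```

In the anchored case the spout tuple is only reported as complete after every word derived from it has been acked. In the unanchored case the words are invisible to the tracker, so a failure in the count bolt would go unnoticed.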