What is the point of timing out tuples?

Time: 2014-10-23 20:08:24

Tags: apache-storm

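I'm hoping someone can explain this behavior to me, because it seems very unexpected.

From what I can see, timing out a tuple does nothing except call "fail" on the Spout. The tuple itself is still processed by the topology; the only difference is that acking or failing it no longer has any effect. A second problem is that the number of in-flight tuples grows: a timed-out tuple no longer counts as pending, even though it is still flowing through the topology. Unless I'm missing something, these two problems combined make timing out tuples pointless at best and very problematic at worst (since it throws more tuples into a topology that is already maxed out).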

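Here is my topology:

  • I have a Spout. On nextTuple, it re-emits a failed tuple if there is one; otherwise it creates a new tuple.
  • I have a Bolt that takes 4 seconds to ack a tuple.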
  • topology.max.spout.pending = 5
  • topology.message.timeout.secs = 5

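I expected 1 or 2 tuples to be acked and the other 4 or 3 to time out, after which the Bolt would process the re-emitted tuples next. Over time, more and more tuples would be acked (although they would time out often).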

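What I see instead is that tuples are still processed by the Bolt even after they have timed out. I assume the Bolt has a buffer/queue and that timed-out tuples are not purged from it. Either way, this causes every tuple to time out, because the Bolt ends up processing only tuples that have already timed out.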
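The arithmetic behind this can be sketched with a small model. It is a back-of-envelope simplification, not Storm's actual scheduling: it assumes one single-threaded Bolt draining a FIFO queue, all tuples emitted at t=0, and a timeout that fires exactly at the configured value (in the log below the first failures arrive roughly 10 seconds after the emit, so the real firing is coarser). The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one single-threaded bolt draining a FIFO queue (not Storm code). */
final class TimeoutModel {

    /**
     * For each of `pending` tuples emitted at t=0, reports whether the bolt
     * finishes it before `timeoutSecs` have elapsed.
     */
    static List<Boolean> finishedInTime(int pending, int processSecs, int timeoutSecs) {
        List<Boolean> result = new ArrayList<>();
        for (int i = 0; i < pending; i++) {
            // Tuple i waits behind i earlier tuples, then needs its own processing time.
            int doneAt = (i + 1) * processSecs;
            result.add(doneAt <= timeoutSecs);
        }
        return result;
    }

    public static void main(String[] args) {
        // Settings from the question: 5 pending tuples, 4 s per tuple, 5 s timeout.
        System.out.println(finishedInTime(5, 4, 5)); // prints [true, false, false, false, false]
    }
}
```

Under these assumptions only the first tuple beats the deadline; every replayed tuple then queues up behind stale copies that the Bolt still has to work through.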

I assume, and hope, that I'm missing something obvious here...

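Two questions:

  1. Can I stop Bolts from processing tuples that have already timed out?
  2. What is the point of timing out a tuple? All it does is call fail on the Spout, even though the tuple will still be processed by the rest of the topology!

Thanks.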

Spout:

    public class SampleSpout extends BaseRichSpout {
        private static Logger logger = LoggerFactory.getLogger(SampleSpout.class);
    
        SpoutOutputCollector collector;
        Map<Integer, List<Object>> pending_map = new HashMap<Integer, List<Object>>();
        Queue<List<Object>> replay_queue = new LinkedBlockingQueue<List<Object>>();
    
        int contentCounter;
        int curMsgId;
    
        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // msg-id increments on every emit, including replays.
            // content stays the same when a failed tuple is re-emitted.
           declarer.declare(new Fields("msg-id", "content"));
        }
    
        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector spoutOutputCollector) {
            collector = spoutOutputCollector;
        }
    
        @Override
        public void nextTuple() {
            // either replay a failed tuple, or create a new one
            List<Object> tuple = null;
            if (replay_queue.size() > 0){
                tuple = replay_queue.poll();
            }else{
                tuple = new ArrayList<Object>();
                tuple.add(null);
                tuple.add("Content #" + contentCounter++);
            }
    
            // increment msgId and set it as the first item in the tuple
            int msgId = this.curMsgId++;
            tuple.set(0, msgId);
            logger.info("Emitting: " + tuple);
            // add this tuple to the 'pending' map, and emit it.
            pending_map.put(msgId, tuple);
            collector.emit(tuple, msgId);
            Utils.sleep(100);
        }
    
        @Override
        public void ack(Object msgId){
            // remove tuple from pending_map since it's no longer pending
            List<Object> acked_tuple = pending_map.remove(msgId);
            logger.info("Acked: " + acked_tuple);
        }
    
        @Override
        public void fail(Object msgId){
            // remove tuple from pending_map since it's no longer pending
            List<Object> failed_tuple = pending_map.remove(msgId);
            logger.info("Failed: " + failed_tuple);
    
            // put a copy into the replay queue
            ArrayList<Object> copy = new ArrayList<Object>(failed_tuple);
            replay_queue.add(copy);
        }
    }
    

Bolt:

    public class SamplePrintBolt extends BaseRichBolt {
    
        private static Logger logger = LoggerFactory.getLogger(SamplePrintBolt.class);
    
        OutputCollector collector;
    
        @Override
        public void prepare(Map stormConf, TopologyContext context, OutputCollector outputCollector) {
            collector = outputCollector;
        }
    
        @Override
        public void execute(Tuple input) {
            logger.info("I see: " + input.getValues() + ". Will now sleep...");
            Utils.sleep(4000);
            logger.info("Done sleeping. Acking: "  + input.getValues());
            collector.ack(input);
        }
    
        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // doesn't emit
        }
    }
    

Main:

    public static void main(String[] args) throws Exception {
        Config conf = new Config();
        conf.setMaxSpoutPending(5);
        conf.setMessageTimeoutSecs(5);

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new SampleSpout());
        builder.setBolt("bolt1", new SamplePrintBolt()).shuffleGrouping("spout");

        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("local", conf, builder.createTopology());
    }
    

Output:

    30084 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [0, Content #0]
    30085 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [0, Content #0]. Will now sleep...
    30097 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [1, Content #1]
    30097 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [2, Content #2]
    30097 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [3, Content #3]
    30097 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [4, Content #4]
    34086 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [0, Content #0]
    34086 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [1, Content #1]. Will now sleep...
    34087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Acked: [0, Content #0]
    34087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [5, Content #5]
    38087 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [1, Content #1]
    38087 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [2, Content #2]. Will now sleep...
    38089 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Acked: [1, Content #1]
    38089 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [6, Content #6]
    -- So far, so good… however, now it's time for things to timeout.
    40082 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [5, Content #5]
    40082 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [4, Content #4]
    40082 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [3, Content #3]
    40083 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [2, Content #2]
    40083 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [7, Content #5]
    40084 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [8, Content #4]
    40084 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [9, Content #3]
    40085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [10, Content #2]
    42088 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [2, Content #2]
    -- Acking a timed-out tuple… this does nothing.
    42088 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [3, Content #3]. Will now sleep…
    -- Why is it looking at tuple #3?  This has already failed.
    45084 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [6, Content #6]
    45085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [11, Content #6]
    46089 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [3, Content #3]
    46089 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [4, Content #4]. Will now sleep...
    50084 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [10, Content #2]
    50085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [7, Content #5]
    50085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [8, Content #4]
    50085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [9, Content #3]
    50085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [12, Content #2]
    50085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [13, Content #5]
    50085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [14, Content #4]
    50085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [15, Content #3]
    -- More timeouts…
    50090 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [4, Content #4]
    50090 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [5, Content #5]. Will now sleep...
    54091 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [5, Content #5]
    -- Yet the Bolt looks at tuple #5 which timed out 15 seconds ago…
    54091 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [6, Content #6]. Will now sleep...
    55085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [11, Content #6]
    55085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [16, Content #6]
    58091 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [6, Content #6]
    58092 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [7, Content #5]. Will now sleep...
    60085 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [15, Content #3]
    60086 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [12, Content #2]
    60086 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [13, Content #5]
    60086 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [14, Content #4]
    60086 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [17, Content #3]
    60086 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [18, Content #2]
    60086 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [19, Content #5]
    60086 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [20, Content #4]
    62093 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [7, Content #5]
    62093 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [8, Content #4]. Will now sleep…
    -- It's clear that the Bolt looks at tuples even if they have timed out.  Its queue will get longer and longer and tuples will always time out.
    65086 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [16, Content #6]
    65087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [21, Content #6]
    66094 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [8, Content #4]
    66094 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [9, Content #3]. Will now sleep...
    70087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [20, Content #4]
    70087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [19, Content #5]
    70087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [18, Content #2]
    70088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [17, Content #3]
    70088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [22, Content #4]
    70088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [23, Content #5]
    70088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [24, Content #2]
    70088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [25, Content #3]
    70095 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [9, Content #3]
    70095 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [10, Content #2]. Will now sleep...
    74096 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [10, Content #2]
    74096 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [11, Content #6]. Will now sleep...
    75088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [21, Content #6]
    75088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [26, Content #6]
    78097 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [11, Content #6]
    78097 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [12, Content #2]. Will now sleep...
    80087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [25, Content #3]
    80087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [24, Content #2]
    80087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [23, Content #5]
    80087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [22, Content #4]
    80087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [27, Content #3]
    80087 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [28, Content #2]
    80088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [29, Content #5]
    80088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [30, Content #4]
    82098 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [12, Content #2]
    82098 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [13, Content #5]. Will now sleep...
    85088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [26, Content #6]
    85088 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [31, Content #6]
    86098 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [13, Content #5]
    86099 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [14, Content #4]. Will now sleep...
    90100 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [14, Content #4]
    90101 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [15, Content #3]. Will now sleep...
    90216 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [29, Content #5]
    90216 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [30, Content #4]
    90216 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [28, Content #2]
    90217 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [27, Content #3]
    90217 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [32, Content #5]
    90217 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [33, Content #4]
    90217 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [34, Content #2]
    90217 [Thread-10-spout] INFO  com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [35, Content #3]
    94101 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [15, Content #3]
    94101 [Thread-8-bolt1] INFO  com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [16, Content #6]. Will now sleep…
    -- Problem gets exacerbated…  Bolt is now looking at tuples that failed 30 seconds ago.
    

1 Answer:

Answer 0 (score: 3)

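Timeouts are a feature for catching tuples that go missing: either they were sent to a worker that has died, or a Bolt simply never acked or failed them.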

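"topology.message.timeout.secs" should therefore be set to a value that you believe your topology will never exceed, although it also depends on how quickly your requirements say failed tuples must be handled. If the value is too low, you risk timing out tuples that are still "alive" and clogging your system (as in the original post). If the value is too high, you may wait too long before failed tuples are reprocessed (again, this depends on your requirements).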
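As a rough sizing aid, the numbers from the original post can be plugged into a simple worst-case estimate. This is only a sketch under stated assumptions: a single Bolt executor, a fixed processing time per tuple, and no other latency in the topology. The helper names are hypothetical and not part of Storm.

```java
/** Rough sizing helper for topology.message.timeout.secs (illustrative only). */
final class TimeoutSizing {

    /**
     * Worst-case time for the last of `maxPending` tuples to be acked when a
     * single bolt executor handles them one after another.
     */
    static int worstCaseLatencySecs(int maxPending, int processSecs) {
        return maxPending * processSecs;
    }

    /** True if the configured timeout leaves room for that worst case. */
    static boolean timeoutIsSafe(int maxPending, int processSecs, int timeoutSecs) {
        return timeoutSecs >= worstCaseLatencySecs(maxPending, processSecs);
    }

    public static void main(String[] args) {
        // Values from the original post: max.spout.pending = 5, 4 s per tuple, 5 s timeout.
        System.out.println(worstCaseLatencySecs(5, 4)); // prints 20
        System.out.println(timeoutIsSafe(5, 4, 5));     // prints false
    }
}
```

With these settings the last pending tuple can need about 20 seconds, so a 5 second timeout is bound to fire on tuples that are still alive. Raising the timeout or lowering topology.max.spout.pending would both close that gap.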

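For the case of timing out "live" tuples, one solution is to synchronize the clocks across all systems, emit a timestamp with every tuple, and have each individual Bolt simply skip tuples that are too old and can therefore be assumed to have timed out.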
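A minimal sketch of that age check follows. It is kept free of Storm classes so it runs on its own. It assumes the Spout adds a hypothetical "emitted-at" field holding System.currentTimeMillis(), and that the Bolt's execute method calls isStale before doing any real work; neither the class nor the field exists in the code from the question.

```java
import java.util.concurrent.TimeUnit;

/** Decides whether a timestamped tuple is already older than the message timeout. */
final class StaleTupleCheck {
    private final long timeoutMillis;

    StaleTupleCheck(int messageTimeoutSecs) {
        this.timeoutMillis = TimeUnit.SECONDS.toMillis(messageTimeoutSecs);
    }

    /**
     * @param emittedAtMillis timestamp the spout attached when emitting (assumed field)
     * @param nowMillis       current time on the bolt's machine (clocks assumed in sync)
     * @return true if the tuple has been in flight for longer than the timeout
     */
    boolean isStale(long emittedAtMillis, long nowMillis) {
        return nowMillis - emittedAtMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        StaleTupleCheck check = new StaleTupleCheck(5);
        long now = System.currentTimeMillis();
        System.out.println(check.isStale(now - 12_000, now)); // prints true
        System.out.println(check.isStale(now - 1_000, now));  // prints false
    }
}
```

In a Bolt, a stale tuple would be skipped instead of costing the 4 seconds of processing, which lets the queue drain down to tuples that can still be acked in time.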

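In my humble opinion, this behavior should be handled by Storm itself. Storm could send (spout, timestamp) data with every tuple and, on each batch of timeouts, simply tell every Bolt to fail anything older than (spout, timeout-timestamp). The comparison is fast enough, and a Bolt can discard the (spout, timeout-timestamp) data as soon as it encounters a tuple from that spout that is new enough.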
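The bookkeeping such a scheme would need on the Bolt side might look like the sketch below. This is purely illustrative of the proposal and not something Storm provides: the class, its methods, and the string spout id are all hypothetical, and forgetting the cutoff on the first newer tuple assumes tuples from one spout arrive in emit order.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-bolt record of (spout, timeout-timestamp) cutoffs. */
final class TimeoutCutoffs {
    private final Map<String, Long> cutoffBySpout = new HashMap<>();

    /** Records that tuples from spoutId emitted at or before cutoffMillis have timed out. */
    void announceTimeout(String spoutId, long cutoffMillis) {
        // Keep the latest cutoff if several timeout batches are announced.
        cutoffBySpout.merge(spoutId, cutoffMillis, Long::max);
    }

    /** Returns true if the tuple is covered by a cutoff and should be dropped. */
    boolean shouldDrop(String spoutId, long emittedAtMillis) {
        Long cutoff = cutoffBySpout.get(spoutId);
        if (cutoff == null) {
            return false; // no timeout announced for this spout
        }
        if (emittedAtMillis <= cutoff) {
            return true; // tuple belongs to a timed-out batch
        }
        // Tuple is new enough, so the cutoff data can be discarded.
        cutoffBySpout.remove(spoutId);
        return false;
    }

    public static void main(String[] args) {
        TimeoutCutoffs cutoffs = new TimeoutCutoffs();
        cutoffs.announceTimeout("spout", 1_000L);
        System.out.println(cutoffs.shouldDrop("spout", 900L));   // prints true
        System.out.println(cutoffs.shouldDrop("spout", 1_500L)); // prints false
    }
}
```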