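I'm hoping someone can explain this behavior to me, because it seems unexpected.
As far as I can tell, timing out a tuple does nothing apart from calling "fail" on the Spout. The tuple itself is still processed through the topology, except that acking or failing it has no effect. A second problem is that the number of tuples in flight grows: timed-out tuples are no longer counted as pending, even though they are still flowing through the topology. Unless I'm missing something, these two problems combined make tuple timeouts at best completely pointless and at worst very problematic, since they just throw more tuples into a topology that is already maxed out.
Here's my topology:
I'd expect 1 or 2 tuples to get acked and 4 or 3 tuples to time out, after which the Bolt would process the re-sent tuples next. Over time, more and more tuples would get acked (although they would often time out).
What I'm seeing instead is that tuples are still processed by the Bolt even after they have timed out. I assume the Bolt has a buffer/queue and that timed-out tuples are not cleared from it. Either way, this leads to every tuple timing out, because the Bolt eventually processes only tuples that have already timed out.
I'm assuming, and hoping, that I'm missing something obvious here...
Two questions:
Thanks.
Spout: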
public class SampleSpout extends BaseRichSpout {
    private static Logger logger = LoggerFactory.getLogger(SampleSpout.class);
    SpoutOutputCollector collector;
    Map<Integer, List<Object>> pending_map = new HashMap<Integer, List<Object>>();
    Queue<List<Object>> replay_queue = new LinkedBlockingQueue<List<Object>>();
    int contentCounter;
    int curMsgId;

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // msg-id increments on every emit, including replays.
        // content only changes when a new tuple is created.
        declarer.declare(new Fields("msg-id", "content"));
    }

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector spoutOutputCollector) {
        collector = spoutOutputCollector;
    }

    @Override
    public void nextTuple() {
        // either replay a failed tuple, or create a new one
        List<Object> tuple = null;
        if (replay_queue.size() > 0) {
            tuple = replay_queue.poll();
        } else {
            tuple = new ArrayList<Object>();
            tuple.add(null);
            tuple.add("Content #" + contentCounter++);
        }
        // increment msgId and set it as the first item in the tuple
        int msgId = this.curMsgId++;
        tuple.set(0, msgId);
        logger.info("Emitting: " + tuple);
        // add this tuple to the 'pending' map, and emit it.
        pending_map.put(msgId, tuple);
        collector.emit(tuple, msgId);
        Utils.sleep(100);
    }

    @Override
    public void ack(Object msgId) {
        // remove tuple from pending_map since it's no longer pending
        List<Object> acked_tuple = pending_map.remove(msgId);
        logger.info("Acked: " + acked_tuple);
    }

    @Override
    public void fail(Object msgId) {
        // remove tuple from pending_map since it's no longer pending
        List<Object> failed_tuple = pending_map.remove(msgId);
        logger.info("Failed: " + failed_tuple);
        // put a copy into the replay queue
        ArrayList<Object> copy = new ArrayList<Object>(failed_tuple);
        replay_queue.add(copy);
    }
}
Bolt:
public class SamplePrintBolt extends BaseRichBolt {
    private static Logger logger = LoggerFactory.getLogger(SamplePrintBolt.class);
    OutputCollector collector;

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector outputCollector) {
        collector = outputCollector;
    }

    @Override
    public void execute(Tuple input) {
        logger.info("I see: " + input.getValues());
        // simulate slow processing: 4 seconds per tuple
        Utils.sleep(4000);
        logger.info("Done sleeping. Acking: " + input.getValues());
        collector.ack(input);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // doesn't emit
    }
}
Main:
public static void main(String[] args) throws Exception {
    Config conf = new Config();
    conf.setMaxSpoutPending(5);
    conf.setMessageTimeoutSecs(5);

    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("spout", new SampleSpout());
    builder.setBolt("bolt1", new SamplePrintBolt()).shuffleGrouping("spout");

    LocalCluster cluster = new LocalCluster();
    cluster.submitTopology("local", conf, builder.createTopology());
}
Output:
30084 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [0, Content #0]
30085 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [0, Content #0]. Will now sleep...
30097 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [1, Content #1]
30097 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [2, Content #2]
30097 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [3, Content #3]
30097 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [4, Content #4]
34086 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [0, Content #0]
34086 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [1, Content #1]. Will now sleep...
34087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Acked: [0, Content #0]
34087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [5, Content #5]
38087 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [1, Content #1]
38087 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [2, Content #2]. Will now sleep...
38089 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Acked: [1, Content #1]
38089 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [6, Content #6]
-- So far, so good… however, now it's time for things to timeout.
40082 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [5, Content #5]
40082 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [4, Content #4]
40082 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [3, Content #3]
40083 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [2, Content #2]
40083 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [7, Content #5]
40084 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [8, Content #4]
40084 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [9, Content #3]
40085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [10, Content #2]
42088 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [2, Content #2]
-- Acking a timed-out tuple… this does nothing.
42088 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [3, Content #3]. Will now sleep…
-- Why is it looking at tuple #3? This has already failed.
45084 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [6, Content #6]
45085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [11, Content #6]
46089 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [3, Content #3]
46089 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [4, Content #4]. Will now sleep...
50084 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [10, Content #2]
50085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [7, Content #5]
50085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [8, Content #4]
50085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [9, Content #3]
50085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [12, Content #2]
50085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [13, Content #5]
50085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [14, Content #4]
50085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [15, Content #3]
-- More timeouts…
50090 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [4, Content #4]
50090 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [5, Content #5]. Will now sleep...
54091 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [5, Content #5]
-- Yet the Bolt looks at tuple #5 which timed out 15 seconds ago…
54091 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [6, Content #6]. Will now sleep...
55085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [11, Content #6]
55085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [16, Content #6]
58091 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [6, Content #6]
58092 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [7, Content #5]. Will now sleep...
60085 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [15, Content #3]
60086 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [12, Content #2]
60086 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [13, Content #5]
60086 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [14, Content #4]
60086 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [17, Content #3]
60086 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [18, Content #2]
60086 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [19, Content #5]
60086 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [20, Content #4]
62093 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [7, Content #5]
62093 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [8, Content #4]. Will now sleep…
-- It's clear that the Bolt looks at tuples even if they have timed out. Its queue will get longer and longer and tuples will always time out.
65086 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [16, Content #6]
65087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [21, Content #6]
66094 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [8, Content #4]
66094 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [9, Content #3]. Will now sleep...
70087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [20, Content #4]
70087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [19, Content #5]
70087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [18, Content #2]
70088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [17, Content #3]
70088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [22, Content #4]
70088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [23, Content #5]
70088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [24, Content #2]
70088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [25, Content #3]
70095 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [9, Content #3]
70095 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [10, Content #2]. Will now sleep...
74096 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [10, Content #2]
74096 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [11, Content #6]. Will now sleep...
75088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [21, Content #6]
75088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [26, Content #6]
78097 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [11, Content #6]
78097 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [12, Content #2]. Will now sleep...
80087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [25, Content #3]
80087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [24, Content #2]
80087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [23, Content #5]
80087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [22, Content #4]
80087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [27, Content #3]
80087 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [28, Content #2]
80088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [29, Content #5]
80088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [30, Content #4]
82098 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [12, Content #2]
82098 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [13, Content #5]. Will now sleep...
85088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [26, Content #6]
85088 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [31, Content #6]
86098 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [13, Content #5]
86099 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [14, Content #4]. Will now sleep...
90100 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [14, Content #4]
90101 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [15, Content #3]. Will now sleep...
90216 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [29, Content #5]
90216 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [30, Content #4]
90216 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [28, Content #2]
90217 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Failed: [27, Content #3]
90217 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [32, Content #5]
90217 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [33, Content #4]
90217 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [34, Content #2]
90217 [Thread-10-spout] INFO com.appnexus.bsg.billing.storminator.SampleSpout - Emitting: [35, Content #3]
94101 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - Done sleeping. Acking: [15, Content #3]
94101 [Thread-8-bolt1] INFO com.appnexus.bsg.billing.storminator.SamplePrintBolt - I see: [16, Content #6]. Will now sleep…
-- Problem gets exacerbated… Bolt is now looking at tuples that have failed 30 seconds ago.
Answer 0 (score: 3)
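"Timing out" is a feature for catching tuples that have gone missing: either they were sent to a worker that has died, or a Bolt simply never acked or failed them.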
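Therefore, "topology.message.timeout.secs" should be set to a value that you think your topology will never exceed, but that also meets your requirement for how long you are willing to wait for a tuple to be processed. If you set the value too low, you risk timing out tuples that are still "alive" and clogging your system (as in the original post). If you set it too high, you may wait too long before failed tuples are reprocessed (again, this depends on your requirements).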
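As a rough illustration of picking that value: the question's topology has a single bolt executor that sleeps 4 seconds per tuple, with max spout pending set to 5. The sketch below estimates how long the last pending tuple waits before it is acked. The class and method names are made up for this example and are not part of Storm, and the estimate assumes each executor works through its queue one tuple at a time, with tuples spread evenly and transfer overhead ignored.

```java
// Hypothetical helper, not part of Storm: a back-of-the-envelope estimate only.
class TimeoutEstimate {

    // Worst-case seconds until the last pending tuple is acked, assuming each
    // bolt executor processes its queued tuples strictly one at a time.
    static long worstCaseLatencySecs(int maxSpoutPending, long secsPerTuple, int boltExecutors) {
        // Tuples queued on the busiest executor (ceiling division, even spread assumed).
        long queuedPerExecutor = (maxSpoutPending + boltExecutors - 1) / boltExecutors;
        return queuedPerExecutor * secsPerTuple;
    }

    public static void main(String[] args) {
        // Values from the question: max pending 5, 4 s per tuple, one bolt executor.
        long latency = worstCaseLatencySecs(5, 4, 1);
        long configuredTimeout = 5; // conf.setMessageTimeoutSecs(5) in the question
        System.out.println("Worst-case latency: " + latency + " s");
        System.out.println("Timeout large enough: " + (configuredTimeout > latency));
    }
}
```

With the question's numbers the estimate is 20 seconds, so a 5-second timeout will expire tuples that are still waiting in the bolt's queue. Under the same assumptions, raising the timeout above that figure or lowering max spout pending to 1 would keep live tuples from timing out.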
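For the case of timing out "live" tuples, one solution is to synchronize the clocks across all systems, emit a timestamp with every tuple, and have each individual Bolt simply skip tuples that are too old and can therefore be assumed to have timed out.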
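A minimal sketch of that staleness check, kept free of Storm classes so it runs on its own. It assumes the spout attaches an emit timestamp in milliseconds to every tuple and that clocks are in sync across machines; the class and method names are illustrative, not Storm APIs.

```java
// Hypothetical helper: decides whether a tuple is already past its timeout.
class StaleTupleCheck {

    private final long timeoutMillis;

    StaleTupleCheck(long timeoutSecs) {
        this.timeoutMillis = timeoutSecs * 1000L;
    }

    // emitMillis: timestamp the spout attached when it emitted the tuple.
    // nowMillis: current time on the bolt's machine (clocks assumed in sync).
    boolean isStale(long emitMillis, long nowMillis) {
        return nowMillis - emitMillis >= timeoutMillis;
    }

    public static void main(String[] args) {
        StaleTupleCheck check = new StaleTupleCheck(5); // 5 s, as in the question
        long emitted = 30_097L; // emit time of tuple 3, taken from the question's log
        System.out.println(check.isStale(emitted, 34_086L)); // false: about 4 s old
        System.out.println(check.isStale(emitted, 42_088L)); // true: about 12 s old
    }
}
```

Inside a bolt's execute method, a tuple for which isStale returns true would be skipped rather than processed. Whether to still ack or fail it is a design choice, since the question reports that acking a timed-out tuple has no effect.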
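In my humble opinion, this behavior should be handled by Storm itself. Storm could send (spout, timestamp) data along with every tuple and, on each batch of timeouts, simply tell each Bolt to fail anything older than (spout, timeout-timestamp). The comparison is fast enough, and a Bolt can discard the (spout, timeout-timestamp) data as soon as it encounters a tuple from that spout that is new enough.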
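To make the proposal concrete, here is one way the bolt-side bookkeeping could look. This is purely a sketch of the idea: Storm does not provide such a mechanism, all names are hypothetical, and dropping the entry on the first newer tuple assumes tuples from a given spout arrive in emit order.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposal; not an existing Storm feature.
class TimeoutCutoffs {

    // Per spout: tuples emitted at or before this time are considered timed out.
    private final Map<String, Long> cutoffBySpout = new HashMap<>();

    // Called when a batch of timeouts for a spout is announced to the bolt.
    void recordTimeout(String spoutId, long cutoffMillis) {
        cutoffBySpout.merge(spoutId, cutoffMillis, Long::max);
    }

    // Returns true if the tuple should be dropped as already timed out.
    boolean shouldDrop(String spoutId, long emitMillis) {
        Long cutoff = cutoffBySpout.get(spoutId);
        if (cutoff == null) {
            return false; // no timeout announced for this spout
        }
        if (emitMillis > cutoff) {
            // New enough: the cutoff entry is no longer needed (in-order arrival assumed).
            cutoffBySpout.remove(spoutId);
            return false;
        }
        return true;
    }

    // Number of spouts that still have a cutoff entry.
    int trackedSpouts() {
        return cutoffBySpout.size();
    }

    public static void main(String[] args) {
        TimeoutCutoffs cutoffs = new TimeoutCutoffs();
        cutoffs.recordTimeout("spout", 200L);
        System.out.println(cutoffs.shouldDrop("spout", 150L)); // true: older than the cutoff
        System.out.println(cutoffs.shouldDrop("spout", 250L)); // false: newer, entry discarded
        System.out.println(cutoffs.trackedSpouts());           // 0
    }
}
```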