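What is the ideal number of messages that can be processed on a single-node machine with 1 spout and 1 bolt? What are the possible ways to increase the processing speed of a Storm topology?
Update: Here is the sample code. It does not include the RabbitMQ and Cassandra code, but it shows the same performance problem.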
// Topology Class
public class SimpleTopology {
    public static void main(String[] args) throws InterruptedException {
        System.out.println("hiiiiiiiiiii");
        TopologyBuilder topologyBuilder = new TopologyBuilder();
        topologyBuilder.setSpout("SimpleSpout", new SimpleSpout());
        topologyBuilder.setBolt("SimpleBolt", new SimpleBolt(), 2).setNumTasks(4).shuffleGrouping("SimpleSpout");
        Config config = new Config();
        config.setDebug(true);
        config.setNumWorkers(2);
        LocalCluster localCluster = new LocalCluster();
        localCluster.submitTopology("SimpleTopology", config, topologyBuilder.createTopology());
        Thread.sleep(2000);
    }
}
// Simple Bolt
public class SimpleBolt implements IRichBolt {
    private OutputCollector outputCollector;

    public void prepare(Map map, TopologyContext tc, OutputCollector oc) {
        this.outputCollector = oc;
    }

    public void execute(Tuple tuple) {
        this.outputCollector.ack(tuple);
    }

    public void cleanup() {
        // TODO
    }

    public void declareOutputFields(OutputFieldsDeclarer ofd) {
        // TODO
    }

    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}
// Simple Spout
public class SimpleSpout implements IRichSpout {
    private SpoutOutputCollector spoutOutputCollector;
    private boolean completed = false;
    private static int i = 0;

    public void open(Map map, TopologyContext tc, SpoutOutputCollector soc) {
        this.spoutOutputCollector = soc;
    }

    public void close() {
        // Todo
    }

    public void activate() {
        // Todo
    }

    public void deactivate() {
        // Todo
    }

    public void nextTuple() {
        if (!completed) {
            if (i < 100000) {
                String item = "Tag" + Integer.toString(i++);
                System.out.println(item);
                this.spoutOutputCollector.emit(new Values(item), item);
            } else {
                completed = true;
            }
        } else {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException ex) {
                Logger.getLogger(SimpleSpout.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    }

    public void ack(Object o) {
        System.out.println("\n\n OK : " + o);
    }

    public void fail(Object o) {
        System.out.println("\n\n Fail : " + o);
    }

    public void declareOutputFields(OutputFieldsDeclarer ofd) {
        ofd.declare(new Fields("word"));
    }

    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}
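Update: Is it possible that, with shuffle grouping, the same tuple gets processed multiple times? The configuration used is spouts = 4, bolts = 4. The problem now is that performance drops as the number of bolts increases.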
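Answer 0 (score: 4)
You should find out where the bottleneck is: RabbitMQ or Cassandra. Open the Storm UI and look at the latency of each component.
If increasing parallelism does not help (and it usually should), then the problem is definitely in RabbitMQ or Cassandra, so you should focus on them.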
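To make the latency numbers easier to compare, here is a rough sketch in plain Java. It assumes the estimate that the Storm UI reports as bolt "capacity", namely (tuples executed × average execute latency) / window length; that formula is quoted from memory, and all figures and names below (`BoltCapacity`, `writer`, `parser`) are hypothetical. A value close to 1.0 suggests the component is busy nearly all the time and is a likely bottleneck.

```java
// Illustrative only: the formula is an assumption about how the UI derives
// "capacity", and the sample numbers are invented.
class BoltCapacity {

    /** Fraction of the window an executor spent inside execute(). */
    static double capacity(long executed, double executeLatencyMs, long windowMs) {
        return executed * executeLatencyMs / windowMs;
    }

    public static void main(String[] args) {
        long windowMs = 10 * 60 * 1000; // a 10-minute window

        // Hypothetical bolt that writes to the database: 2 ms per tuple.
        double writer = capacity(290_000, 2.0, windowMs);
        // Hypothetical bolt that only parses: 0.05 ms per tuple.
        double parser = capacity(290_000, 0.05, windowMs);

        System.out.printf("writer capacity: %.2f%n", writer);
        System.out.printf("parser capacity: %.2f%n", parser);
    }
}
```

With numbers like these, the writing bolt would be the place to add parallelism or to investigate the external system, while the parsing bolt has plenty of headroom.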
Answer 1 (score: 2)
In your code, only one tuple is emitted per call to nextTuple(). Try emitting more tuples per call.
Something like this:
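public void nextTuple() {
    int max = 1000;   // cap on tuples emitted per call
    int count = 0;
    try {
        GetResponse response = channel.basicGet(queueName, autoAck);
        while ((response != null) && (count < max)) {
            // process message (assumed here: the body is decoded as a string)
            String item = new String(response.getBody());
            spoutOutputCollector.emit(new Values(item), item);
            count++;
            response = channel.basicGet(queueName, autoAck);
        }
    } catch (IOException ex) {
        // basicGet declares IOException; nextTuple() cannot rethrow it as checked
        throw new RuntimeException(ex);
    }
    try {
        Thread.sleep(2000);
    } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
    }
}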
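The batching idea can be tried without RabbitMQ or Storm. The following is a minimal plain-Java sketch in which a `LinkedBlockingQueue` stands in for the message broker and `nextBatch` (a hypothetical helper, not part of any Storm or RabbitMQ API) pulls up to a fixed number of messages per call without blocking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Stand-in for a spout's nextTuple(): drain up to maxBatch messages per call.
class BatchedPoll {

    /** Returns at most maxBatch queued messages; an empty list if none are waiting. */
    static List<String> nextBatch(BlockingQueue<String> source, int maxBatch) {
        List<String> batch = new ArrayList<>();
        source.drainTo(batch, maxBatch); // non-blocking bulk removal
        return batch;
    }

    public static void main(String[] args) {
        BlockingQueue<String> source = new LinkedBlockingQueue<>();
        for (int i = 0; i < 2500; i++) {
            source.add("Tag" + i);
        }

        int calls = 0;
        List<String> batch;
        while (!(batch = nextBatch(source, 1000)).isEmpty()) {
            calls++;
            // In a real spout, each element would be emitted here.
            System.out.println("call " + calls + " handled " + batch.size() + " messages");
        }
    }
}
```

In a real spout the cap matters because nextTuple() should return promptly; a bounded batch amortizes the per-call overhead without holding up the calling thread for long.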
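Answer 2 (score: 0)
We use RabbitMQ with Storm successfully. The results are stored in a different DB, but that does not matter. At first we used basic_get in the spout and performance was terrible, but then we switched to basic_consume and performance was actually very good. So take a look at how you consume messages from Rabbit. Some important factors: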