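Overview: I'm a student who wants to run some performance tests (WordCount) on Storm / Kafka / Flink / MS Azure SA / Spark. I want to use a Kafka broker as the input source.
I took the WordCount example from the Storm-Starter project and added Kafka as the spout: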
public class WordCountKafkaTopology {

    public static class SplitSentence extends ShellBolt implements IRichBolt {

        public SplitSentence() {
            super("python", "splitsentence.py");
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }

        @Override
        public Map<String, Object> getComponentConfiguration() {
            return null;
        }
    }

    public static class WordCount extends BaseBasicBolt {
        Map<String, Integer> counts = new HashMap<String, Integer>();

        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            String word = tuple.getString(0);
            Integer count = counts.get(word);
            if (count == null)
                count = 0;
            count++;
            counts.put(word, count);
            collector.emit(new Values(word, count));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word", "count"));
        }
    }

    public static void main(String[] args) {
        String zkIp = "localhost";
        String topicName = "perfTest";
        List<String> nimbus_seeds = new ArrayList<String>();
        nimbus_seeds.add("localhost");
        String zookeeperHost = zkIp + ":2181";

        ZkHosts zkHosts = new ZkHosts(zookeeperHost);
        SpoutConfig kafkaConfig = new SpoutConfig(zkHosts, topicName, "/" + topicName, topicName);
        kafkaConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
        KafkaSpout kafkaSpout = new KafkaSpout(kafkaConfig);

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafkaPerfTestSpout", kafkaSpout, 8);
        builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("kafkaPerfTestSpout");
        builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

        Config config = new Config();
        config.setMaxTaskParallelism(5);
        config.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 2);
        config.put(Config.NIMBUS_SEEDS, nimbus_seeds);
        config.put(Config.NIMBUS_THRIFT_PORT, 6627);
        config.put(Config.STORM_ZOOKEEPER_PORT, 2181);
        config.put(Config.STORM_ZOOKEEPER_SERVERS, Arrays.asList(zkIp));

        try {
            StormSubmitter.submitTopology("my-kafka-topology", config, builder.createTopology());
        } catch (Exception e) {
            throw new IllegalStateException("Couldn't initialize the topology", e);
        }
    }
}
When I run the topology, I get several error messages. The spout reports:
java.lang.ExceptionInInitializerError
    at kafka.metrics.KafkaMetricsGroup$class.newTimer(KafkaMetricsGroup.scala:89)
    at kafka.consumer.FetchRequestAndResponseMetrics.newTimer(FetchRequestAndResponseStats.scala:26)
    at kafka.consumer.FetchRequestAndResponseMetrics.<init>(FetchRequestAndResponseStats.scala:35)
    at kafka.consumer.FetchRequestAndResponseStats.<init>(FetchRequestAndResponseStats.scala:47)
    at kafka.consumer.FetchRequestAndResponseStatsRegistry$$anonfun$2.apply(FetchRequestAndResponseStats.scala:60)
    at kafka.consumer.FetchRequestAndResponseStatsRegistry$$anonfun$2.apply(FetchRequestAndResponseStats.scala:60)
    at kafka.utils.Pool.getAndMaybePut(Pool.scala:59)
    at kafka.consumer.FetchRequestAndResponseStatsRegistry$.getFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:64)
    at kafka.consumer.SimpleConsumer.<init>(SimpleConsumer.scala:44)
    at kafka.javaapi.consumer.SimpleConsumer.<init>(SimpleConsumer.scala:34)
    at org.apache.storm.kafka.DynamicPartitionConnections.register(DynamicPartitionConnections.java:60)
    at org.apache.storm.kafka.PartitionManager.<init>(PartitionManager.java:74)
    at org.apache.storm.kafka.ZkCoordinator.refresh(ZkCoordinator.java:98)
    at org.apache.storm.kafka.ZkCoordinator.getMyManagedPartitions(ZkCoordinator.java:69)
    at org.apache.storm.kafka.KafkaSpout.nextTuple(KafkaSpout.java:129)
    at org.apache.storm.daemon.executor$fn__7990$fn__8005$fn__8036.invoke(executor.clj:648)
    at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:484)
    at clojure.lang.AFn.run(AFn.java:22)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Shutdown in progress
    at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:66)
    at java.lang.Runtime.addShutdownHook(Runtime.java:211)
    at com.yammer.metrics.Metrics.<clinit>(Metrics.java:21)
    ... 19 more
The split bolt reports:
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: pid:3973, name:split exitCode:0, errorString:
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:464)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
    at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
    at org.apache.storm.daemon.executor$fn__8058$fn__8071$fn__8124.invoke(executor.clj:850)
    at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:484)
    at clojure.lang.AFn.run(AFn.java:22)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: pid:3973, name:split exitCode:0, errorString:
    at org.apache.storm.task.ShellBolt.execute(ShellBolt.java:150)
    at org.apache.storm.daemon.executor$fn__8058$tuple_action_fn__8060.invoke(executor.clj:731)
    at org.apache.storm.daemon.executor$mk_task_receiver$fn__7979.invoke(executor.clj:464)
    at org.apache.storm.disruptor$clojure_handler$reify__7492.onEvent(disruptor.clj:40)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
    ... 6 more
Caused by: java.lang.RuntimeException: pid:3973, name:split exitCode:0, errorString:
    at org.apache.storm.task.ShellBolt.die(ShellBolt.java:295)
    at org.apache.storm.task.ShellBolt.access$400(ShellBolt.java:70)
    at org.apache.storm.task.ShellBolt$BoltWriterRunnable.run(ShellBolt.java:398)
    ... 1 more
Caused by: java.io.IOException: Broken pipe
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:326)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:297)
    at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
    at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
    at java.io.BufferedWriter.flush(BufferedWriter.java:254)
    at org.apache.storm.multilang.JsonSerializer.writeString(JsonSerializer.java:99)
    at org.apache.storm.multilang.JsonSerializer.writeMessage(JsonSerializer.java:93)
    at org.apache.storm.multilang.JsonSerializer.writeBoltMsg(JsonSerializer.java:78)
    at org.apache.storm.utils.ShellProcess.writeBoltMsg(ShellProcess.java:127)
    at org.apache.storm.task.ShellBolt$BoltWriterRunnable.run(ShellBolt.java:387)
    ... 1 more
I used kafka-console-producer to produce some messages. I hope someone can help me. I'm new to programming with Storm...
Answer 0 (score: 0)
Removing "config.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 2);" did the trick!
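The answer does not say why removing that setting helps. One plausible explanation (an assumption, not confirmed in this thread) is that setting TOPOLOGY_TICK_TUPLE_FREQ_SECS at the topology level makes Storm deliver tick tuples to every bolt, and a bolt that treats each incoming tuple as a word, as WordCount does with tuple.getString(0), then receives input it was not written for. If tick tuples are actually wanted, a bolt can skip them before doing its normal work. The sketch below illustrates that guard in plain Java so it runs without Storm on the classpath: SimpleTuple and WordCounter are hypothetical stand-ins, and only the two string constants mirror Storm's system component and tick stream IDs. In a real bolt, Storm's TupleUtils.isTick(tuple) should serve the same purpose.

```java
import java.util.HashMap;
import java.util.Map;

public class TickGuardSketch {

    // These values mirror Storm's Constants.SYSTEM_COMPONENT_ID and
    // Constants.SYSTEM_TICK_STREAM_ID; everything else here is hypothetical.
    static final String SYSTEM_COMPONENT_ID = "__system";
    static final String SYSTEM_TICK_STREAM_ID = "__tick";

    /** Hypothetical stand-in for a Storm tuple: source component, stream, one value. */
    static final class SimpleTuple {
        final String sourceComponent;
        final String streamId;
        final Object value;

        SimpleTuple(String sourceComponent, String streamId, Object value) {
            this.sourceComponent = sourceComponent;
            this.streamId = streamId;
            this.value = value;
        }
    }

    /** True when the tuple comes from the system component on the tick stream. */
    static boolean isTick(SimpleTuple t) {
        return SYSTEM_COMPONENT_ID.equals(t.sourceComponent)
                && SYSTEM_TICK_STREAM_ID.equals(t.streamId);
    }

    /** Hypothetical counterpart of the WordCount bolt, with a tick guard added. */
    static final class WordCounter {
        private final Map<String, Integer> counts = new HashMap<>();

        /** Returns the updated count for the word, or -1 if the tuple was a tick and was skipped. */
        int execute(SimpleTuple t) {
            if (isTick(t)) {
                return -1; // not a word: skip instead of casting the value to String
            }
            String word = (String) t.value;
            return counts.merge(word, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        WordCounter counter = new WordCounter();
        System.out.println(counter.execute(new SimpleTuple("split", "default", "storm")));
        System.out.println(counter.execute(
                new SimpleTuple(SYSTEM_COMPONENT_ID, SYSTEM_TICK_STREAM_ID, 2)));
        System.out.println(counter.execute(new SimpleTuple("split", "default", "storm")));
    }
}
```

Running main prints 1, -1, and 2: the tick tuple in the middle is ignored and does not disturb the count for "storm". If no component needs periodic ticks, dropping the setting altogether, as the answer does, is the simpler choice.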