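I am moving a topology from Storm to Flink. The topology has been reduced to KafkaSpout -> Bolt. The bolt only counts packets rather than trying to decode them.
The compiled .jar is submitted to Flink via flink -c <entry point> <path to .jar> and produces the following error: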
java.lang.Exception: Call to registerInputOutput() of invokable failed
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:529)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot instantiate user function.
at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:190)
at org.apache.flink.streaming.runtime.tasks.StreamTask.registerInputOutput(StreamTask.java:174)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:526)
... 1 more
Caused by: java.io.StreamCorruptedException: unexpected block data
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1365)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:294)
at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:255)
at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:175)
... 3 more
My question:
Should I use the Storm KafkaSpout, or would it be better to write my own using the Flink KafkaSource?
Edit:
Here is the relevant code:
BrokerHosts brokerHosts = new ZkHosts(configuration.getString("kafka/zookeeper"));
SpoutConfig kafkaConfig = new SpoutConfig(brokerHosts, configuration.getString("kafka/topic"), "/storm_env_values", "storm_env_DEBUG");
FlinkTopologyBuilder builder = new FlinkTopologyBuilder();
builder.setSpout("environment", new KafkaSpout(kafkaConfig), 1);
builder.setBolt("decode_bytes", new EnvironmentBolt(), 1).shuffleGrouping("environment");
FlinkLocalCluster cluster = new FlinkLocalCluster(); // replaces: LocalCluster cluster = new LocalCluster();
cluster.submitTopology("env_topology", conf, buildTopology());
The bolt is based on BaseRichBolt. Its execute() function only logs the presence of each packet for debugging. There is no other code in it.
Answer 0 (score: 1)
I just had a look at this. There is one issue right now, but I got it working locally. You can apply this hot fix to your code and build the compatibility layer yourself.
KafkaSpout registers metrics. However, metrics are currently not supported by the compatibility layer. You need to remove the exception in FlinkTopologyContext.registerMetric(...) and just return null. (There is already an open PR that works on the integration of metrics, so I don't want to push this hot fix into the master branch.)
For the ZooKeeper settings in the Config, I just made up some values here:
Config c = new Config();
List<String> zkServers = new ArrayList<String>();
zkServers.add("localhost");
c.put(Config.STORM_ZOOKEEPER_SERVERS, zkServers);
c.put(Config.STORM_ZOOKEEPER_PORT, 2181);
c.put(Config.STORM_ZOOKEEPER_SESSION_TIMEOUT, 30);
c.put(Config.STORM_ZOOKEEPER_CONNECTION_TIMEOUT, 30);
c.put(Config.STORM_ZOOKEEPER_RETRY_TIMES, 3);
c.put(Config.STORM_ZOOKEEPER_RETRY_INTERVAL, 5);
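To make the registerMetric(...) hot fix described above a bit more concrete, here is a minimal, self-contained sketch of the idea. It does not use the real Flink or Storm classes: MetricStub, ContextSketch, ThrowingContextSketch, PatchedContextSketch and RegisterMetricHotFixDemo are hypothetical stand-ins, and the UnsupportedOperationException is only my assumption about which exception the compatibility layer throws. The point is just the shape of the change: the method stops throwing and returns null instead.

```java
// Hypothetical stand-in for Storm's metric type, so the sketch compiles on its own.
interface MetricStub {
    Object getValueAndReset();
}

// Hypothetical stand-in for the context a spout sees while it opens.
interface ContextSketch {
    <T extends MetricStub> T registerMetric(String name, T metric, int timeBucketSizeInSecs);
}

// Before the hot fix (assumed behaviour): registering a metric throws.
class ThrowingContextSketch implements ContextSketch {
    @Override
    public <T extends MetricStub> T registerMetric(String name, T metric, int timeBucketSizeInSecs) {
        throw new UnsupportedOperationException("Metrics are not supported");
    }
}

// After the hot fix: the metric is silently ignored and null is returned.
class PatchedContextSketch implements ContextSketch {
    @Override
    public <T extends MetricStub> T registerMetric(String name, T metric, int timeBucketSizeInSecs) {
        return null;
    }
}

class RegisterMetricHotFixDemo {
    // Mimics a spout that registers a metric while opening.
    // Returns true if opening gets past the registration call.
    static boolean openSucceeds(ContextSketch context) {
        MetricStub offsetMetric = () -> 0; // dummy metric value
        try {
            context.registerMetric("offsetMetric", offsetMetric, 60);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("before hot fix: " + openSucceeds(new ThrowingContextSketch()));
        System.out.println("after hot fix:  " + openSucceeds(new PatchedContextSketch()));
    }
}
```

With the change applied, any metric the spout registers is simply dropped, so this is only a workaround until metrics are properly integrated.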
In addition to flink-storm, you need:
<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-kafka</artifactId>
    <version>0.9.4</version>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.10</artifactId>
    <version>0.8.1.1</version>
</dependency>
This works for me, using Kafka_2.10-0.8.1.1 and FlinkLocalCluster executed within Eclipse.
It also works in a local Flink cluster started via bin/start-local-streaming.sh. For this, using the bin/flink run command, you need to use FlinkSubmitter instead of FlinkLocalCluster. Furthermore, you need to include the following dependencies in your jar:
<include>org.apache.storm:storm-kafka</include>
<include>org.apache.kafka:kafka_2.10</include>
<include>org.apache.curator:curator-client</include>
<include>org.apache.curator:curator-framework</include>
<include>com.google.guava:guava</include>
<include>com.yammer.metrics:metrics-core</include>
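If you want a quick sanity check that these dependencies really ended up on the classpath of your jar, something like the following sketch might help. It only uses the JDK. The class BundledClassCheck and its method names are hypothetical, and the class names in EXPECTED are my assumption of one representative class per artifact for the versions listed above; adjust them if your versions differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BundledClassCheck {

    // One representative class per required artifact (assumed names, not verified
    // against your exact versions).
    static final List<String> EXPECTED = Arrays.asList(
            "storm.kafka.KafkaSpout",                        // storm-kafka
            "kafka.javaapi.consumer.SimpleConsumer",         // kafka_2.10
            "org.apache.curator.RetryPolicy",                // curator-client
            "org.apache.curator.framework.CuratorFramework", // curator-framework
            "com.google.common.collect.ImmutableList",       // guava
            "com.yammer.metrics.Metrics");                   // metrics-core

    // Returns the subset of classNames that cannot be loaded from the current classpath.
    static List<String> missingClasses(List<String> classNames) {
        ClassLoader loader = BundledClassCheck.class.getClassLoader();
        List<String> missing = new ArrayList<String>();
        for (String name : classNames) {
            try {
                // 'false' means: locate and load the class, but do not run its static initializers.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println("Missing from classpath: " + missingClasses(EXPECTED));
    }
}
```

Compiled into the jar and run with only that jar on the classpath, an empty list suggests the includes took effect. It does not prove the topology will deserialize correctly on the cluster, but a missing class here is a likely cause of trouble.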