如何在Flink中设置RocksDBStateBackend参数?

时间:2017-06-29 08:04:39

标签: apache-flink

我使用以下代码设置RocksDBStateBackend及其选项,它可以在本地正确运行,但无法提交到群集。

    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    RocksDBStateBackend rocksDBBackEnd = new RocksDBStateBackend("file:///Users/zsh/tmp/rocksdb");
    rocksDBBackEnd.setOptions(new OptionsFactory() {
        @Override
        public DBOptions createDBOptions(DBOptions currentOptions) {
            return currentOptions;
        }

        @Override
        public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {

            final long blockCacheSize = 8 * 1024 * 1024;
            final long blockSize = 4 * 1024;
            final long targetFileSize = 2 * 1024 * 1024;
            final long writeBufferSize = 64 * 1024 * 1024;
            final int writeBufferNum = 1; //default 2
            final int minBufferToMerge = 1; //default 2

            return currentOptions
                    .setCompactionStyle(CompactionStyle.LEVEL)
                    .setTargetFileSizeBase(targetFileSize)
                    .setWriteBufferSize(writeBufferSize)
                    .setMaxWriteBufferNumber(writeBufferNum)
                    .setMinWriteBufferNumberToMerge(minBufferToMerge)
                    .setTableFormatConfig(
                            new BlockBasedTableConfig()
                                    .setBlockCacheSize(blockCacheSize)
                                    .setBlockSize(blockSize)
                    );

        }
    });
    env.setStateBackend(rocksDBBackEnd);
    ....
    env.execute();

当我以这种方式提交工作时:

flink run -d -c gerryzhou.metricTest target/gerryzhou.flink-1.0-SNAPSHOT.jar

它抛出异常:

org.apache.flink.client.program.ProgramInvocationException: The program execution failed: JobManager did not respond within 60000 milliseconds
at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:505)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:103)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:442)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:76)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:387)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1129)

jobmanager.log看起来像这样

        2017-06-29 15:37:16,651 WARN  akka.remote.ReliableDeliverySupervisor                      - Association with remote system [akka.tcp://flink@10.242.98.255:51891] has failed, address is now gated for [5000] ms. Reason: [gerryzhou.metricTest$1]
    2017-06-29 15:37:16,651 ERROR Remoting                                                      - gerryzhou.metricTest$1
    java.lang.ClassNotFoundException: gerryzhou.metricTest$1
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:677)

我已更改了我的代码并使用单个类文件OptionsFactory实现了MRocksDBFactory,请使用此rocksDBBackEnd.setOptions(new MRocksDBFactory());。 jobManager.log中的错误信息变为:

2017-06-29 16:29:27,162 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@10.242.98.255:52638] has failed, address is now gated for [5000] ms. Reason: [gerryzhou.MRocksDBFactory]
2017-06-29 16:29:27,163 ERROR Remoting                                                      - gerryzhou.MRocksDBFactory
java.lang.ClassNotFoundException: gerryzhou.MRocksDBFactory
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:677)
    at akka.util.ClassLoaderObjectInputStream.resolveClass(ClassLoaderObjectInputStream.scala:19)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1819)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1986)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
    at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:136)
    at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
    at scala.util.Try$.apply(Try.scala:192)
    at akka.serialization.Serialization.deserialize(Serialization.scala:98)
    at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:23)
    at akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:58)
    at akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:58)
    at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:76)
    at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:967)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
    at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:437)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

有人能帮助我吗?

0 个答案:

没有答案