Flink作业在第二次提交后崩溃

时间:2018-02-20 03:06:23

标签: apache-flink flink-streaming

我想从Flink作业向AWS S3传输数据。以下是将数据流式传输到S3的简单测试应用程序的链接。

https://github.com/dmiljkovic/test-flink-bucketingsink-s3

当jar被提交到我的机器上的Flink“cluster”时,IntelliJ的代码也可以工作。问题是工作只能工作一次。如果第二次提交作业,则会产生堆栈跟踪。如果我重新启动群集作业,但只是第一次。

org.apache.commons.logging.LogConfigurationException: java.lang.IllegalAccessError: tried to access class org.apache.commons.logging.impl.LogFactoryImpl$3 from class org.apache.commons.logging.impl.LogFactoryImpl (Caused by java.lang.IllegalAccessError: tried to access class org.apache.commons.logging.impl.LogFactoryImpl$3 from class org.apache.commons.logging.impl.LogFactoryImpl)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:637)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:76)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:102)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:88)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:96)
    at com.amazonaws.http.ConnectionManagerFactory.createPoolingClientConnManager(ConnectionManagerFactory.java:26)
    at com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:96)
    at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:158)
    at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:119)
    at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:389)
    at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:371)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:235)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1206)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:258)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalAccessError: tried to access class org.apache.commons.logging.impl.LogFactoryImpl$3 from class org.apache.commons.logging.impl.LogFactoryImpl
    at org.apache.commons.logging.impl.LogFactoryImpl.getParentClassLoader(LogFactoryImpl.java:700)
    at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1187)
    at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:914)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
    ... 26 more
Source: Collection Source -> S3BucketingSink_write__bbb_UID_1 -> (Sink: S3BucketingSink_write__bbb_UID_2, Sink: S3BucketingSink_write__bbb_UID_3) (1/1)

1 个答案:

答案 0 :(得分:1)

这是因为“commons-logging”与反向类加载不兼容。

两个直接的解决方法是:

  • 切换到父级优先加载
  • 从应用程序jar文件中删除所有Hadoop和commons-logging代码

对于Flink 1.4.2和Flink 1.5,我们确保“commons-logging”始终加载parent-first。特别处理公共事件 - 这样的日志记录在项目中似乎很常见(JBoss,Tomcat等都这样做)。