EXCEPTION_ACCESS_VIOLATION in Flink when reading data from ValueState

Time: 2017-02-06 08:36:26

Tags: java fatal-error apache-flink flink-streaming

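I get an unexplainable JVM crash when reading data from a shared ValueState in Flink. I'm not sure whether I'm doing something stupid, whether I've stumbled upon a bug in Flink (or elsewhere), or whether this is expected behavior (although I doubt that a "fatal error in the JRE" is expected). Can anyone explain how to solve or work around this?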

The error message I get is:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x0000000054e9a390, pid=6328, tid=0x00000000000002f8
#
# JRE version: Java(TM) SE Runtime Environment (8.0_111-b14) (build 1.8.0_111-b14)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.111-b14 mixed mode windows-amd64 compressed oops)
# Problematic frame:
# C  [zip.dll+0xa390]
#
# Failed to write core dump. Minidumps are not enabled by default on client versions of Windows
#
# An error report file with more information is saved as:
# C:\Users\bjornper\eclipse_workspace\TestEnvironment\hs_err_pid6328.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

I can reproduce the crash with the following program:

Main class:

package flinkjvmcrash;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MainMinimalCrash {

    public static void main(String[] args) throws Exception {

        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        //env.setParallelism(1);

        DataStream<LogRow> logRowDataStream = env.addSource(new MyDataSource());

        logRowDataStream.keyBy("sourceId").flatMap(new Aggregator());

        env.execute("My Data Flow");
    }
}

Aggregator:

package flinkjvmcrash;

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class Aggregator extends RichFlatMapFunction<LogRow, LogRow> {

    // I need this class (or something similar) to process delta values between incoming LogRow objects
    public class AggregationData {
        public String release = "";
        public long timestamp = 0;
    }

    private transient ValueState<AggregationData> aggregationData;

    @Override
    public void open(Configuration config) {
        ValueStateDescriptor<AggregationData> descriptorAggregationData =
                new ValueStateDescriptor<AggregationData>(
                        "aggregationData",
                        TypeInformation.of(new TypeHint<AggregationData>() {}),
                        new AggregationData());
        aggregationData = getRuntimeContext().getState(descriptorAggregationData);
    }

    @Override
    public void flatMap(LogRow value, Collector<LogRow> out) throws Exception {
        AggregationData data = aggregationData.value(); // Commenting out this row makes the bug disappear

        // This function will of course do more work, but that is not relevant to the bug/crash
    }
}

Data source:

package flinkjvmcrash;

import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

public class MyDataSource extends RichSourceFunction<LogRow> {

    @Override
    public void run(SourceContext<LogRow> ctx) throws Exception {
        // produces 8000 data objects in quick succession
        for (int i = 0; i < 8000; i++) {
            LogRow logRow = new LogRow();
            ctx.collect(logRow);
        }
    }

    @Override
    public void cancel() {
    }
}

The data object used in the stream:

package flinkjvmcrash;

public class LogRow {
    public String sourceId;
    public String release;
    public Long timestamp;

    public Integer lotsOfMoreFields;
}

The environment I'm using:

  • Java 8
  • Flink 1.1.3
  • Eclipse IDE

System info:

OS: Windows 10.0 , 64 bit Build 14393 (10.0.14393.0)

CPU:total 4 (2 cores per cpu, 2 threads per core) family 6 model 60 stepping 3, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2

Memory: 4k page, physical 8299612k(4301944k free), swap 11052124k(6589788k free)

vm_info: Java HotSpot(TM) 64-Bit Server VM (25.111-b14) for windows-amd64 JRE (1.8.0_111-b14), built on Sep 22 2016 19:24:05 by "java_re" with MS VC++ 10.0 (VS2010)

Edit 1:

After changing the AggregationData inner class (the type stored in the ValueState) to a Tuple2<String, Long>, the problem no longer occurs.
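
One structural difference between the two variants is that AggregationData is declared as a non-static inner class of Aggregator, while Tuple2 is a standalone class. The plain-Java sketch below only illustrates that difference; it does not use Flink and does not establish the cause of the crash. The names NestedClassCheck, Outer, InnerData and NestedData are hypothetical stand-ins, not part of the program above. It shows that a non-static inner class has no zero-argument constructor when viewed through reflection, because its constructor takes the enclosing instance as a parameter. As far as I understand Flink's POJO rules, a type has to be public and standalone (not a non-static inner class) to be treated as a POJO, so a class shaped like InnerData would presumably be handled by a generic serializer instead.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

class NestedClassCheck {

    // Hypothetical stand-in for Aggregator (no Flink dependency).
    public static class Outer {
        // Non-static inner class, declared the same way as AggregationData.
        public class InnerData {
            public String release = "";
            public long timestamp = 0;
        }

        // Static nested variant of the same data holder.
        public static class NestedData {
            public String release = "";
            public long timestamp = 0;
        }
    }

    // True if cls is a member class declared without the static modifier.
    public static boolean isNonStaticInner(Class<?> cls) {
        return cls.isMemberClass() && !Modifier.isStatic(cls.getModifiers());
    }

    // True if cls declares a constructor that takes no arguments at all.
    public static boolean hasNoArgConstructor(Class<?> cls) {
        for (Constructor<?> c : cls.getDeclaredConstructors()) {
            if (c.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Class<?>[] candidates = { Outer.InnerData.class, Outer.NestedData.class };
        for (Class<?> cls : candidates) {
            System.out.println(cls.getSimpleName()
                    + ": nonStaticInner=" + isNonStaticInner(cls)
                    + ", noArgConstructor=" + hasNoArgConstructor(cls));
        }
    }
}
```

If this difference matters here, declaring AggregationData as a public static nested class (or as a top-level class) would be another variant worth testing next to the Tuple2 change.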

Edit 2:

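It is now the day after I first posted the question, and the same code no longer reproduces the problem. The computer was shut down overnight, but apart from that I have no idea why the problem has stopped appearing (yesterday I could still reproduce it even after a reboot).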

At @rmetzger's request, here is the hs_err_pid6328.log file: https://gist.github.com/Plankton555/e6ba1224b34035e91c5d5933f1c73549

0 answers:

No answers yet.