Apache Flink 1.5.2: Rowtime timestamp is null

Date: 2018-09-17 02:06:24

Tags: java, apache-flink

I am running a query with the following code:

    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    DataStream<Row> ds = SourceHelp.builder().env(env).consumer010(MyKafka.builder().build().kafkaWithWaterMark2())
            .rowTypeInfo(MyRowType.builder().build().typeInfo())
            .build().source4();
    //,proctime.proctime,rowtime.rowtime
    String sql1 = "select a,b,max(rowtime)as rowtime from user_device group by a,b";
    DataStream<Row> ds2 = TableHelp.builder().tableEnv(tableEnv).tableName("user_device").fields("a,b,rowtime.rowtime")
            .rowTypeInfo(MyRowType.builder().build().typeInfo13())
            .sql(sql1).in(ds).build().result();

    ds2.print();
    // String sql2 = "select a,count(b) as b from user_device2 group by a";
    String sql2 = "select a,count(b) as b,HOP_END(rowtime,INTERVAL '5' SECOND,INTERVAL '30' SECOND) as c from user_device2 group by HOP(rowtime, INTERVAL '5' SECOND, INTERVAL '30' SECOND),a";
    DataStream<Row> ds3 = TableHelp.builder().tableEnv(tableEnv).tableName("user_device2").fields("a,b,rowtime.rowtime")
            .rowTypeInfo(MyRowType.builder().build().typeInfo14())
            .sql(sql2).in(ds2).build().result();

    ds3.print();
    env.execute("test");

Note: in sql1 I apply the max function to rowtime. It does not work and throws the following exception:


    Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: java.lang.RuntimeException: Rowtime timestamp is null. Please make sure that a proper TimestampAssigner is defined and the stream environment uses the EventTime time characteristic.
        at org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:625)
        at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:123)
        at com.aicaigroup.water.WaterTest.testRowtimeWithMoreSqls5(WaterTest.java:158)
        at com.aicaigroup.water.WaterTest.main(WaterTest.java:20)
    Caused by: java.lang.RuntimeException: Rowtime timestamp is null. Please make sure that a proper TimestampAssigner is defined and the stream environment uses the EventTime time characteristic.
        at DataStreamSourceConversion$24.processElement(Unknown Source)
        at org.apache.flink.table.runtime.CRowOutputProcessRunner.processElement(CRowOutputProcessRunner.scala:67)
        at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:558)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:533)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:513)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$BroadcastingOutputCollector.collect(OperatorChain.java:628)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$BroadcastingOutputCollector.collect(OperatorChain.java:581)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:679)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:657)
        at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
        at com.aicaigroup.TableHelp$1.processElement(TableHelp.java:42)
        at com.aicaigroup.TableHelp$1.processElement(TableHelp.java:39)
        at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:558)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:533)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:513)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:679)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:657)
        at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:558)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:533)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:513)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:679)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:657)
        at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
        at org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:151)
        at org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:39)
        at org.apache.flink.streaming.api.operators.LegacyKeyedProcessOperator.processElement(LegacyKeyedProcessOperator.java:88)
        at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202)
        at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:104)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
        at java.lang.Thread.run(Thread.java:748)

    2018-09-17 09:51:53.679 [Kafka 0.10 Fetcher for Source: Custom Source -> Map -> from: (a, b, rowtime) -> select: (a, b, CAST(rowtime) AS rowtime) (2/8)] INFO o.a.kafka.clients.consumer.internals.AbstractCoordinator - Discovered coordinator 172.16.11.91:9092 (id: 2147483647 rack: null) for group test.

I then tried changing sql1 to "select a,b,rowtime from user_device", and that works fine. So how do I fix the error? The first SQL needs the GROUP BY, and the second SQL needs rowtime for the time window. Thanks.

1 Answer:

Answer 0 (score: 1)

I started with Flink 1.6 and ran into a problem similar to yours. I solved it with these steps (a minimal sketch follows the list):

  • Use assignTimestampsAndWatermarks; the default general-purpose implementation BoundedOutOfOrdernessTimestampExtractor is enough. You need to implement its extractTimestamp method to extract the timestamp value, and you pass the allowed out-of-orderness interval to its constructor.
  • Append ,proctime.proctime,rowtime.rowtime to the end of the field list (I use fromDataStream in Flink 1.6 to convert the stream into a table).
  • If you want to use an existing field as the rowtime: for example, if the source fields are "a,clicktime,c", you can declare them as "a,clicktime.rowtime,c".
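
Putting those steps together, here is a minimal sketch against the Flink 1.6 Java Table API. The field names (a, b, eventMillis, clicktime), the timestamp field index, and the 10-second out-of-orderness bound are illustrative assumptions, not values taken from your post:

    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;

    public class RowtimeSketch {
        public static void register(StreamExecutionEnvironment env,
                                    StreamTableEnvironment tableEnv,
                                    DataStream<Row> ds) {
            // The stream environment must use event time, otherwise rowtime stays null.
            env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

            // Step 1: assign timestamps and watermarks with the default
            // BoundedOutOfOrdernessTimestampExtractor (10 s out-of-orderness assumed).
            DataStream<Row> withTimestamps = ds.assignTimestampsAndWatermarks(
                    new BoundedOutOfOrdernessTimestampExtractor<Row>(Time.seconds(10)) {
                        @Override
                        public long extractTimestamp(Row row) {
                            // Assumes field index 2 ("eventMillis") holds an epoch-millis timestamp.
                            return (long) row.getField(2);
                        }
                    });

            // Step 2: append ,proctime.proctime,rowtime.rowtime to the field list
            // when converting the stream into a table.
            Table userDevice = tableEnv.fromDataStream(withTimestamps,
                    "a,b,eventMillis,proctime.proctime,rowtime.rowtime");
            tableEnv.registerTable("user_device", userDevice);

            // Step 3 (alternative): promote an existing field to the rowtime attribute,
            // e.g. if the source fields were "a,clicktime,c":
            // tableEnv.fromDataStream(withTimestamps, "a,clicktime.rowtime,c");

            // A windowed query can now group on the rowtime attribute.
            Table result = tableEnv.sqlQuery(
                    "SELECT a, COUNT(b) AS b, "
                  + "HOP_END(rowtime, INTERVAL '5' SECOND, INTERVAL '30' SECOND) AS c "
                  + "FROM user_device "
                  + "GROUP BY HOP(rowtime, INTERVAL '5' SECOND, INTERVAL '30' SECOND), a");
        }
    }

Your SourceHelp/TableHelp builders are not shown, so the equivalent calls would need to go wherever those helpers assign watermarks and register the stream as a table.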

Hope it helps.