Flink: why does a CoMap return a "DataStream with Product with Serializable" instead of just a DataStream?

Asked: 2018-02-14 13:59:01

Tags: scala apache-flink flink-streaming

I need to understand why eventStream.connect(otherStream).map(_ => Right(2), _ => Left("2")) does not produce a DataStream[Either[String, Int]] but a DataStream[Either[String, Int] with Product with Serializable]. I am using some APIs that accept a DataStream[T], and passing them a DataStream[T with Product with Serializable] gives a compile-time error. Can someone explain what is happening, and perhaps give me some hints?

Here is an example:

object FlinkFoo {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Silly int source
    val eventStream: DataStream[Int] = env.addSource((sc: SourceContext[Int]) => {
      while (true) sc.collect(1)
    })

    // Silly String source
    val otherStream: DataStream[String] = env.addSource((sc: SourceContext[String]) => {
      while (true) sc.collect("1")
    })

    // I need to connect the two streams and then flatten them
    val connectedStream2: DataStream[Either[String, Int] with Product with Serializable] = eventStream.connect(otherStream).map(_ => Right(2), _ => Left("2"))

    /* Compile time error !!!!
     * found   : org.apache.flink.streaming.api.scala.DataStream[Either[String,Int] with Product with Serializable]
     * [error]  required: org.apache.flink.streaming.api.scala.DataStream[Either[?,?]]
     * [error] Note: Either[String,Int] with Product with Serializable <: Either[?,?], but class DataStream is invariant in type T.
     * [error] You may wish to define T as +T instead. (SLS 4.5)
     * [error]     fooMethod(connectedStream2)
     * [error]               ^
     **/
    fooMethod(connectedStream2)
  }

  def fooMethod[T, P](dataStream: DataStream[Either[T, P]]): Unit = {
    // do something
  }
}

1 answer:

Answer 0 (score: 1)

You can try bringing Flink's Scala implicit serializers and TypeInformation into scope, like this:

import org.apache.flink.streaming.api.scala._

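The package object imported above delegates to the TypeUtils object; together they provide the serializers and the required type information for Either, as well as for many other types.

You need these implicits so that the Either type can be resolved after Flink's generic type resolution. You may also want to add an explicit result type to your assignment: because DataStream is invariant, the annotation makes the compiler infer the element type as plain Either[String, Int] instead of the wider compound type.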

val yourEitherStream: DataStream[Either[String, Int]] =
  eventStream
    .connect(otherStream)
    .map(_ => Right(2), _ => Left("2"))
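Annotating the val is not the only way to pin the element type. The following is a minimal sketch in plain Scala, so that it runs without Flink on the classpath: Box and Connected are hypothetical stand-ins for the invariant DataStream and for the connected streams with their two-function map, not Flink classes. It only illustrates the type inference involved.

```scala
// Hypothetical stand-ins (NOT Flink classes): an invariant container and a
// two-function map whose single result type R is shared by both functions.
final case class Box[T](values: List[T])

final case class Connected(ints: List[Int], strings: List[String]) {
  def map[R](f: Int => R, g: String => R): Box[R] =
    Box(ints.map(f) ++ strings.map(g))
}

object PinningSketch {
  // Accepts only Box[Either[T, P]]; Box is invariant, like DataStream.
  def fooMethod[T, P](box: Box[Either[T, P]]): Int = box.values.size

  private val connected = Connected(List(1), List("1"))

  // 1. Annotate the val: the expected type fixes R.
  val annotated: Box[Either[String, Int]] =
    connected.map(_ => Right(2), _ => Left("2"))

  // 2. Pass the type parameter explicitly.
  val explicit =
    connected.map[Either[String, Int]](_ => Right(2), _ => Left("2"))

  // 3. Ascribe each branch so that both functions already return Either[String, Int].
  val ascribed = connected.map(
    _ => (Right(2): Either[String, Int]),
    _ => (Left("2"): Either[String, Int])
  )
}
```

All three values have the type Box[Either[String, Int]], so each of them can be passed to fooMethod. With Flink's real API the same three options should apply to the map call in the question, provided the implicit TypeInformation from the import above is in scope.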

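The with Product with Serializable mix-in is a Scala 2.11 issue that is resolved in 2.12 (but you cannot use 2.12 with Flink right now). In 2.11, Left and Right are case classes and therefore extend Product with Serializable, while Either itself does not, so the least upper bound of the two branches picks up the extra traits.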
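To see where the extra traits come from on any Scala 2 compiler, here is a small sketch with a hand-written hierarchy shaped like the 2.11 Either. The names Either211, Left211 and Right211 are hypothetical and exist only for this illustration.

```scala
// Hypothetical mini-hierarchy shaped like Scala 2.11's Either: the base class
// does NOT extend Product with Serializable, but every case class does.
sealed abstract class Either211[+A, +B]
final case class Left211[+A, +B](a: A) extends Either211[A, B]
final case class Right211[+A, +B](b: B) extends Either211[A, B]

object LubSketch {
  // No annotation: the element type is the least upper bound of
  // Left211[String, Nothing] and Right211[Nothing, Int]. On Scala 2 that
  // bound includes Product with Serializable next to Either211[String, Int].
  val inferred = Seq(Left211("2"), Right211(2))

  // The elements conform to the compound type, so this assignment compiles.
  val compound: Seq[Either211[String, Int] with Product with Serializable] = inferred

  // With an expected type, the compiler settles on the plain base type.
  val pinned: Seq[Either211[String, Int]] = Seq(Left211("2"), Right211(2))
}
```

For a covariant collection such as Seq the wider element type is harmless. It only becomes a compile error once the value ends up inside an invariant type such as DataStream, which is the situation in the question.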