I need to understand why eventStream.connect(otherStream).map(_ => Right(2), _ => Left("2")) does not produce a DataStream[Either[String, Int]] but instead a DataStream[Either[String, Int] with Product with Serializable].
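I am using some APIs that accept a DataStream[T], and passing them a DataStream[T with Product with Serializable] results in a compile-time error. Can someone explain why this happens and perhaps give me some hints?
Here is an example: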
class FlinkFoo {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Silly Int source
    val eventStream: DataStream[Int] = env.addSource((sc: SourceContext[Int]) => {
      while (true) sc.collect(1)
    })

    // Silly String source
    val otherStream: DataStream[String] = env.addSource((sc: SourceContext[String]) => {
      while (true) sc.collect("1")
    })

    // I need to connect the two streams and then flatten them
    val connectedStream2: DataStream[Either[String, Int] with Product with Serializable] =
      eventStream.connect(otherStream).map(_ => Right(2), _ => Left("2"))

    /* Compile time error !!!!
     * found   : org.apache.flink.streaming.api.scala.DataStream[Either[String,Int] with Product with Serializable]
     * [error] required: org.apache.flink.streaming.api.scala.DataStream[Either[?,?]]
     * [error] Note: Either[String,Int] with Product with Serializable <: Either[?,?], but class DataStream is invariant in type T.
     * [error] You may wish to define T as +T instead. (SLS 4.5)
     * [error] fooMethod(connectedStream2)
     * [error]           ^
     */
    fooMethod(connectedStream2)
  }

  def fooMethod[T, P](dataStream: DataStream[Either[T, P]]): Unit = {
    // do something
  }
}
Answer 0 (score: 1)
You can try bringing Flink's Scala implicit serializers and TypeInformation into scope, like this:
import org.apache.flink.streaming.api.scala._
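The TypeUtils object is invoked by the package object imported above; it provides the serializer and the required type information for Either, as well as for many other types.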
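You need these implicits so that the Either type can be resolved after Flink's generic type resolution. You can also add the result type explicitly to your assignment, so that the compiler infers Either[String, Int] rather than the wider mixed-in type: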
val yourEitherStream: DataStream[Either[String, Int]] =
  eventStream
    .connect(otherStream)
    .map(_ => Right(2), _ => Left("2"))
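To illustrate why the explicit result type helps, here is a minimal sketch that runs without Flink. Stream and Connected are hypothetical stand-ins (not Flink classes) for an invariant DataStream[T] and for the two-function map on connected streams; only fooMethod mirrors the question. It shows two ways of pinning the type parameter: annotating the val, or passing the type argument to map directly.

```scala
// Hypothetical invariant container standing in for Flink's DataStream[T].
final class Stream[T](val items: List[T])

// Hypothetical stand-in for connected streams: one mapping function per input.
final class Connected(ints: List[Int], strings: List[String]) {
  def map[R](f1: Int => R, f2: String => R): Stream[R] =
    new Stream(ints.map(f1) ++ strings.map(f2))
}

object ExplicitTypeSketch {
  // Same shape as fooMethod in the question; returns a count so it can be checked.
  def fooMethod[T, P](s: Stream[Either[T, P]]): Int = s.items.size

  private val connected = new Connected(List(1, 1), List("1"))

  // Option 1: the annotation on the val gives the compiler an expected type,
  // so R is fixed to Either[String, Int] before the lambdas are typed.
  val annotated: Stream[Either[String, Int]] =
    connected.map(_ => Right(2), _ => Left("2"))

  // Option 2: pass the type argument explicitly instead of annotating the val.
  val explicit = connected.map[Either[String, Int]](_ => Right(2), _ => Left("2"))

  def main(args: Array[String]): Unit = {
    // Both values are accepted by the invariant parameter of fooMethod.
    println(fooMethod(annotated))
    println(fooMethod(explicit))
  }
}
```

With the real Flink API the same idea should apply, since the element type of the mapped stream is an ordinary type parameter, but the sketch above has only been reasoned about with these stand-in classes.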
The with Product with Serializable mix-in is a quirk tracked as a Scala 2.11 issue, which is resolved in 2.12 (but you cannot use 2.12 with Flink right now).
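As background on where the mix-in comes from: every case class automatically extends Product and Serializable, and Left and Right are case classes. When the compiler has to find a common type for two case-class values without an expected type, it may therefore include those two traits in the inferred type. The sketch below reproduces the situation with hypothetical types (Event, Click, Key, Box are made up for illustration) and shows that an explicit annotation keeps the type at the plain parent.

```scala
// Hypothetical hierarchy: two case classes under a parent that does not
// itself extend Product or Serializable.
sealed trait Event
final case class Click(x: Int) extends Event
final case class Key(c: Char) extends Event

object LubSketch {
  // Invariant container, like DataStream[T] in the question.
  final class Box[T](val value: T)

  def handle(b: Box[Event]): String = b.value match {
    case Click(x) => s"click at $x"
    case Key(c)   => s"key $c"
  }

  def pick(flag: Boolean): Box[Event] = {
    // Without the ": Event" annotation the compiler may infer
    // "Event with Product with Serializable" for this expression.
    // The annotation gives it an expected type, so it stays Event.
    val e: Event = if (flag) Click(1) else Key('a')
    new Box(e)
  }

  def main(args: Array[String]): Unit = {
    println(handle(pick(true)))
    println(handle(pick(false)))
  }
}
```

For hierarchies you control, a common alternative is to declare the parent as `sealed trait Event extends Product with Serializable`, so that the common type is already the parent itself. That is not possible for Either, which is why the annotation approach is used in the answer.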