如何集体操作MapWithState函数

时间:2018-07-01 00:32:41

标签: scala apache-spark spark-streaming

我有一个流媒体工作,代码在下面:

val filterActions = userActions.filter(Utils.filterPageType)

val parseAction = filterActions.flatMap(record => ParseOperation.parseMatch(categoryMap, record))

val finalActions = parseAction.filter(record => record.get("invalid") == None)

val userModels = finalActions.map(record => (record("deviceid"), record)).mapWithState(StateSpec.function(stateUpdateFunction))

但是除mapWithState函数以外的所有函数都可以顺利编译,ParseOperation.parseMatch(categoryMap, record)的返回类型为ListBuffer[Map[String, Any]],错误如下:

[INFO] Compiling 9 source files to /Users/spare/project/campaign-project/stream-official-mall/target/classes at 1530404002409

[ERROR] /Users/spare/project/campaign-project/stream-official-mall/src/main/scala/com/shopee/mall/data/OfficialMallTracker.scala:77: error: overloaded method value function with alternatives:
[ERROR]   [KeyType, ValueType, StateType, MappedType](mappingFunction: org.apache.spark.api.java.function.Function3[KeyType,org.apache.spark.api.java.Optional[ValueType],org.apache.spark.streaming.State[StateType],MappedType])org.apache.spark.streaming.StateSpec[KeyType,ValueType,StateType,MappedType] <and>
[ERROR]   [KeyType, ValueType, StateType, MappedType](mappingFunction: org.apache.spark.api.java.function.Function4[org.apache.spark.streaming.Time,KeyType,org.apache.spark.api.java.Optional[ValueType],org.apache.spark.streaming.State[StateType],org.apache.spark.api.java.Optional[MappedType]])org.apache.spark.streaming.StateSpec[KeyType,ValueType,StateType,MappedType] <and>
[ERROR]   [KeyType, ValueType, StateType, MappedType](mappingFunction: (KeyType, Option[ValueType], org.apache.spark.streaming.State[StateType]) => MappedType)org.apache.spark.streaming.StateSpec[KeyType,ValueType,StateType,MappedType] <and>
[ERROR]   [KeyType, ValueType, StateType, MappedType](mappingFunction: (org.apache.spark.streaming.Time, KeyType, Option[ValueType], org.apache.spark.streaming.State[StateType]) => Option[MappedType])org.apache.spark.streaming.StateSpec[KeyType,ValueType,StateType,MappedType]

[ERROR]  cannot be applied to ((Any, Map[String,Any], org.apache.spark.streaming.State[Map[String,Any]]) => Some[Map[String,Any]])
[ERROR]     val userModels = finalActions.map(record => (record("deviceid"), record)).mapWithState(StateSpec.function(stateUpdateFunction))
[ERROR]                                                                                                      ^

[ERROR] one error found

是什么原因引起的?如何修改代码?

1 个答案:

答案 0 :(得分:0)

我已修复它,导致StateSpec.function(stateUpdateFunction))要求输入参数的类型为Map [String,Any],在调用它之前,我使用了map函数,代码如下:

val finalActions = parseAction.filter(record => record.get("invalid") == None).map(Utils.parseFinalRecord)
val parseFinalRecord = (record: Map[String, Any]) => {
val recordMap = collection.mutable.Map(record.toSeq: _*)
logger.info(s"recordMap: ${recordMap}")
recordMap.toMap

}

有效!