Spark type mismatch error - found : Any but required: List

Date: 2018-02-03 10:30:48

Tags: scala apache-spark apache-spark-sql

I'm running into a type-mismatch issue: indexing into a `Row` gives me `Any` instead of the array type shown in the schema.

scala> val selectedFieldsDF = dfFinal.select("id","attributes");
scala> selectedFieldsDF.printSchema
root
 |-- id: long (nullable = true)
 |-- attributes: array (nullable = true)
 |    |-- element: map (containsNull = true)
 |    |    |-- key: string
 |    |    |-- value: string (valueContainsNull = true)

Right now the DataFrame has only one record:

scala> selectedFieldsDF.show(20, false)
+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|id|attributes                                                                                                                                                                                                                                                                                          |
+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|1425942469761    |[Map(attribute_value -> Bordspellen, column_id -> 2958, attribute_name -> CAT level 2:Soort gezelschapsspellen), Map(attribute_value -> Gezelschapsspellen, column_id -> 2956, attribute_name -> CAT level 1:Soort), Map(attribute_value -> Spelshop.be, column_id -> 47, attribute_name -> Winkel)]|
+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Defined allItems class:

scala> case class allItems(id: Long, attr: List[Map[String,String]], valid: Boolean)
    defined class allItems

I'm transforming selectedFieldsDF into another RDD using the case class allItems.

But I get the type mismatch error shown below, even though item(1) is an array of maps according to the schema above:

scala> val allPreItemsDF = selectedFieldsDF.rdd.map({item=> allItems(toLong(item(0).toString),item(1),true)
     | })
<console>:36: error: type mismatch;
 found   : Any
 required: List[Map[String,String]]
       val allPreItemsDF = selectedFieldsDF.rdd.map({item=> allItems(toLong(item(0).toString),item(1),true)
                                                                                                  ^

1 answer:

Answer 0 (score: 2)

You should use:

selectedFieldsDF.rdd.map(item =>
  allItems(item.getLong(0), item.getAs[Seq[Map[String, String]]](1).toList, true))

Use Spark's typed `Row` accessors (`getLong`, `getAs[T]`) instead of the untyped `item(0)` / `item(1)`, which return `Any`. Note that Spark materializes `ArrayType` column values as a `Seq` (a `WrappedArray` under the hood), not a `scala.collection.immutable.List`, so casting directly to `List` can fail at runtime; retrieve a `Seq` and convert with `.toList`.