How to replace the value of a specific key in a List[Row] in Scala

Time: 2018-06-17 08:06:56

Tags: scala apache-spark

Scala newbie!! I have a List[Row], where Row is org.apache.spark.sql.Row, with content like this:

val list = List("""{"name":"abc","salary":"somenumber","id":"1"}""")

How can I replace the value of the key salary with something else?

2 Answers:

Answer 0: (score: 1)

Please find the solution below. I hope it helps.

If your input looks like this:

scala> import org.apache.spark.sql.Row
scala> val list = List(Row("""{"name":"abc","salary":"somenumber","id":"1"}"""))
list: List[org.apache.spark.sql.Row] = List([{"name":"abc","salary":"somenumber","id":"1"}])

I am converting the List[org.apache.spark.sql.Row] to a List[scala.collection.immutable.Map[String,String]]. Note that the keys and values keep the double quotes from the JSON text:

scala> val listOfMaps = list.map(row => row(0).toString.replaceAll("[{}]", "").split(",").map { str => val kv = str.split(":"); (kv(0), kv(1)) }.toMap)
listOfMaps: List[scala.collection.immutable.Map[String,String]] = List(Map("name" -> "abc", "salary" -> "somenumber", "id" -> "1"))
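Splitting on "," and ":" breaks as soon as a value itself contains a comma or a colon. A minimal sketch of a slightly more tolerant alternative, using only the Scala standard library: it assumes a flat JSON object whose values are all double-quoted strings without escaped quotes, and `parseFlatJson` is a hypothetical helper name, not a Spark or Scala API. Unlike the split-based version, it also drops the surrounding quotes from keys and values.

```scala
// Matches one "key":"value" pair in a flat JSON object.
// Assumption: values are double-quoted strings with no escaped quotes inside.
val pair = "\"([^\"]*)\"\\s*:\\s*\"([^\"]*)\"".r

// Hypothetical helper: collects every key/value pair into an immutable Map.
def parseFlatJson(json: String): Map[String, String] =
  pair.findAllMatchIn(json).map(m => m.group(1) -> m.group(2)).toMap

// The comma inside "a,b" no longer splits the value.
val parsed = parseFlatJson("""{"name":"a,b","salary":"somenumber","id":"1"}""")
```

For nested objects, numbers, or escaped characters, a real JSON parser would be the safer choice.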

Since I cannot update the values of an immutable Map in place, I convert it to a mutable Map and update the value:

import collection.mutable.Map
scala> val mutableMap = listOfMaps.map(mp => collection.mutable.Map(mp.toSeq: _*)).map(mp => mp + ("\"salary\"" -> "2000"))
mutableMap: List[scala.collection.mutable.Map[String,String]] = List(Map("name" -> "abc", "salary" -> 2000, "id" -> "1"))
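As a side note, an immutable Map cannot be modified in place, but `updated` returns a new map with one key replaced, so the detour through a mutable Map may not be needed. A minimal sketch using only the standard library: the sample data is written out by hand to match what the listOfMaps code above produces (keys and values still carry their literal quote characters), and the name `replaced` is only illustrative.

```scala
// Sample data shaped like listOfMaps above; keys and values still contain
// the literal double quotes left over from the JSON text.
val listOfMaps: List[Map[String, String]] = List(
  Map("\"name\"" -> "\"abc\"", "\"salary\"" -> "\"somenumber\"", "\"id\"" -> "\"1\"")
)

// updated returns a new immutable Map; the original map is left untouched.
val replaced: List[Map[String, String]] =
  listOfMaps.map(mp => mp.updated("\"salary\"", "2000"))
```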

To get the output back in the original List[Row] format:
scala> val ans = mutableMap.map(mp => Row("{" + mp.mkString(",").replaceAll("->", ":") + "}"))
ans: List[org.apache.spark.sql.Row] = List([{"name" : "abc","salary" : 2000,"id" : "1"}])
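If only one value needs to change, the round trip through a Map can be skipped by rewriting the JSON text directly. The sketch below is one possible approach, not the method used in this answer. It assumes a flat JSON object whose target value is a double-quoted string without escaped quotes, and `replaceValue` is a hypothetical helper, not part of Spark or the Scala library.

```scala
import java.util.regex.{Matcher, Pattern}

// Hypothetical helper: replaces the string value of `key` in a flat JSON object.
// Assumption: the value is a double-quoted string with no escaped quotes inside.
def replaceValue(json: String, key: String, newValue: String): String = {
  // Group 1 captures the "key": prefix so it can be kept as-is.
  val pattern = "(\"" + Pattern.quote(key) + "\"\\s*:\\s*)\"[^\"]*\""
  json.replaceAll(pattern, "$1" + Matcher.quoteReplacement("\"" + newValue + "\""))
}

val json = """{"name":"abc","salary":"somenumber","id":"1"}"""
val updatedJson = replaceValue(json, "salary", "2000")
// updatedJson: {"name":"abc","salary":"2000","id":"1"}
```

Applied to the List[Row] from this answer, this would look roughly like list.map(row => Row(replaceValue(row.getString(0), "salary", "2000"))), assuming each Row holds the JSON string in its first field.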

Answer 1: (score: 0)

If you want to keep the data in Spark SQL rows, you can perform the groupBy operation in Spark itself. See the example below, taken from "How to calculate sum and count in a single groupBy?":

// In 1.3.x, in order for the grouping column "department" to show up, 
// it must be included explicitly as part of the agg function call. 
df.groupBy("department").agg($"department", max("age"), sum("expense"))

// In 1.4+, grouping column "department" is included automatically. 
df.groupBy("department").agg(max("age"), sum("expense"))

Feel free to vote to close this as a duplicate; I am leaving it here until a decision is made.