Scala newbie here!! I have a List[Row], where Row is org.apache.spark.sql.Row, and it contains something like
val list = List("""{"name":"abc","salary":"somenumber","id":"1"}""")
How can I replace the value of the salary key with something else?
Answer 0 (score: 1)
Please find a solution below; I hope it helps you.
If your input looks like this:
scala> val list = List("""{"name":"abc","salary":"somenumber","id":"1"}""")
list: List[String] = List({"name":"abc","salary":"somenumber","id":"1"})
First I convert the List[org.apache.spark.sql.Row] to a List[scala.collection.immutable.Map[String,String]]. (On a real Row you would use row(0).toString to pull out the JSON string; since the REPL demo above holds plain Strings, row.toString is enough.)
scala> val listOfMaps = list.map(row => row.toString.replaceAll("[{}]", "").split(",").map(str => (str.split(":")(0), str.split(":")(1))).toMap)
listOfMaps: List[scala.collection.immutable.Map[String,String]] = List(Map("name" -> "abc", "salary" -> "somenumber", "id" -> "1"))
Since I cannot update the values of an immutable Map, I convert it to a mutable Map and update the value there:
scala> import collection.mutable.Map
scala> val mutableMap = listOfMaps.map(mp => collection.mutable.Map(mp.toSeq: _*)).map(mp => mp + ("\"" + "salary" + "\"" -> "2000"))
mutableMap: List[scala.collection.mutable.Map[String,String]] = List(Map("name" -> "abc", "salary" -> 2000, "id" -> "1"))
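As an aside, `mp + (k -> v)` builds a new map even when `mp` is mutable; a true in-place update, which is what the text above describes, uses `mp(k) = v` (sugar for `mp.update(k, v)`). A minimal sketch with made-up values, where the keys carry the embedded quotes produced by the earlier split:

```scala
import scala.collection.mutable

object MutableUpdate {
  // Update the "salary" entry of a mutable map in place and return the map.
  def patch(mp: mutable.Map[String, String]): mutable.Map[String, String] = {
    mp("\"salary\"") = "2000" // same as mp.update("\"salary\"", "2000")
    mp
  }

  def main(args: Array[String]): Unit = {
    val mp = mutable.Map(
      "\"name\""   -> "\"abc\"",
      "\"salary\"" -> "\"somenumber\"",
      "\"id\""     -> "\"1\"")
    println(patch(mp)("\"salary\"")) // 2000
  }
}
```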
Finally, converting back to a List[Row]:
scala> val ans = mutableMap.map(mp => Row("{" + mp.mkString(",").replaceAll("->", ":") + "}"))
ans: List[org.apache.spark.sql.Row] = List([{"name" : "abc","salary" : 2000,"id" : "1"}])
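The split-on-`,`-and-`:` approach above is fragile: it breaks as soon as a value contains a comma or a colon. A simpler alternative, sketched here in plain Scala without Spark and using a hypothetical patchSalary helper, rewrites the salary value with a single regex over the raw JSON string; wrapping each result in Row(...) would give back a List[Row] as in the answer above:

```scala
object SalaryPatch {
  // Replace the value of the "salary" key inside a flat JSON string.
  def patchSalary(json: String, newSalary: String): String =
    json.replaceAll("\"salary\"\\s*:\\s*\"[^\"]*\"",
                    "\"salary\":\"" + newSalary + "\"")

  def main(args: Array[String]): Unit = {
    val list = List("""{"name":"abc","salary":"somenumber","id":"1"}""")
    println(list.map(patchSalary(_, "2000")))
    // List({"name":"abc","salary":"2000","id":"1"})
  }
}
```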
Answer 1 (score: 0)
If you want to keep the data in Spark SQL Rows, you can do the grouping in Spark itself with a groupBy. See the example below, taken from How to calculate sum and count in a single groupBy?
// In 1.3.x, in order for the grouping column "department" to show up,
// it must be included explicitly as part of the agg function call.
df.groupBy("department").agg($"department", max("age"), sum("expense"))
// In 1.4+, grouping column "department" is included automatically.
df.groupBy("department").agg(max("age"), sum("expense"))
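For readers without a Spark shell handy, the same group-then-aggregate idea can be sketched with plain Scala collections; the Emp case class and the sample rows below are made up purely for illustration:

```scala
object GroupByAgg {
  case class Emp(department: String, age: Int, expense: Int)

  // department -> (max age, total expense), mirroring
  // df.groupBy("department").agg(max("age"), sum("expense"))
  def aggregate(emps: List[Emp]): Map[String, (Int, Int)] =
    emps.groupBy(_.department).map { case (dept, rows) =>
      dept -> (rows.map(_.age).max, rows.map(_.expense).sum)
    }

  def main(args: Array[String]): Unit = {
    val emps = List(
      Emp("sales", 30, 100),
      Emp("sales", 45, 250),
      Emp("hr",    38, 80))
    println(aggregate(emps)("sales")) // (45,350)
  }
}
```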
Feel free to vote on the close-as-duplicate flag; I will leave this answer here until a decision is made.