How do I add new values to an existing JSON file?

Asked: 2017-07-27 18:03:11

Tags: json scala

I have a JSON file:

{
  "titlename": "periodic",
  "atom": [
    {
      "usage": "neutron",
      "dailydata": [
        {
          "utcacquisitiontime": "2017-03-27T22:00:00Z",
          "datatimezone": "+02:00",
          "intervalvalue": 28128,
          "intervaltime": 15
        },
        {
          "utcacquisitiontime": "2017-03-27T22:15:00Z",
          "datatimezone": "+02:00",
          "intervalvalue": 25687,
          "intervaltime": 15
        }
      ]
    }
  ]
}

I want to append:

"matter":[
  {
   "usage":"neutron",
   "intervalvalue":345678
  },
  ...
]

For intervalvalue, I need the aggregated (summed) value of the dailydata entries for each usage. I am using Scala and I am able to read the JSON file. Please help me aggregate and append!
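For reference, the aggregation being asked for (summing intervalvalue per usage across the dailydata entries) can be sketched in plain Scala, independent of how the JSON is read. The sample values below mirror the file shown above:

```scala
// Each (usage, intervalvalue) pair taken from the flattened dailydata entries.
val entries = Seq(
  ("neutron", 28128),
  ("neutron", 25687)
)

// Group by usage and sum the interval values for each group.
val totals: Map[String, Int] =
  entries.groupBy(_._1).map { case (usage, vs) => usage -> vs.map(_._2).sum }

println(totals("neutron")) // 53815
```

This is only the core arithmetic; the accepted answer below does the same thing with Spark DataFrames.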

1 answer:

Answer 0 (score: 1)

You should use a DataFrame to produce the JSON you need.

To do that, first convert the JSON file into a DataFrame, which can be done as:

// Read the whole file as one record so the multi-line JSON parses correctly.
// (On Spark 2.2+ you could instead use spark.read.option("multiLine", true).json(path).)
val json = sc.wholeTextFiles("path to the json file")
  .map(tuple => tuple._2.replace("\n", "").trim)

val df = sqlContext.read.json(json)

This will give you the following output:

+--------------------------------------------------------------------------------------------------------+---------+
|atom                                                                                                    |titlename|
+--------------------------------------------------------------------------------------------------------+---------+
|[[WrappedArray([+02:00,15,28128,2017-03-27T22:00:00Z], [+02:00,15,25687,2017-03-27T22:15:00Z]),neutron]]|periodic |
+--------------------------------------------------------------------------------------------------------+---------+

You should then extract usage and intervalvalue from the DataFrame, which can be done as:

import org.apache.spark.sql.functions._
// Flatten the atom array, pull out usage, then flatten dailydata
// and sum intervalvalue per usage.
val tobemergedDF = df.withColumn("atom", explode(col("atom")))
    .withColumn("usage", col("atom.usage"))
    .withColumn("atom", explode(col("atom.dailydata")))
    .withColumn("intervalvalue", col("atom.intervalvalue"))
    .groupBy("usage").agg(sum("intervalvalue").as("intervalvalue"))

tobemergedDF will then be:

+-------+-------------+
|usage  |intervalvalue|
+-------+-------------+
|neutron|53815        |
+-------+-------------+

Now you can write the DataFrame out as JSON and merge the two files.
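The merge step itself is not shown in the answer. A minimal plain-Scala sketch of one way to do it is below: it splices the aggregated values in as a "matter" array just before the closing brace of the original document. Here originalJson and totals are hypothetical stand-ins for the original file's contents and the aggregates collected from tobemergedDF; in real code a JSON library (e.g. play-json or circe) would be a safer choice than string splicing.

```scala
// Hypothetical inputs: the original JSON text and the aggregated totals.
val originalJson = """{"titlename":"periodic","atom":[]}"""
val totals = Map("neutron" -> 53815)

// Build the "matter" array from the aggregates.
val matter = totals
  .map { case (usage, v) => s"""{"usage":"$usage","intervalvalue":$v}""" }
  .mkString("[", ",", "]")

// Splice it in before the final closing brace of the original document.
val merged =
  originalJson.patch(originalJson.lastIndexOf('}'), s""","matter":$matter}""", 1)

println(merged)
// {"titlename":"periodic","atom":[],"matter":[{"usage":"neutron","intervalvalue":53815}]}
```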

Hope the answer is helpful.