How do I convert JSON keys to lowercase?

Time: 2019-08-07 19:57:56

Tags: scala apache-spark apache-spark-sql spark-structured-streaming

I have a streaming query that reads JSON data from Kafka, with more than 1000 keys, all in camelCase.

scala> kafka_df.printSchema()
root
 |-- jsonData: struct (nullable = true)
 |    |-- header: struct (nullable = true)
 |    |    |-- batch_id: string (nullable = true)
 |    |    |-- entity: string (nullable = true)
 |    |    |-- time: integer (nullable = true)
 |    |    |-- key: array (nullable = true)
 |    |    |    |-- element: string (containsNull = true)
 |    |    |-- message_type: string (nullable = true)
 |    |-- body: string (nullable = true)

How can I recursively change the keys to lowercase and convert the result back into a DataFrame, so that I can write it out with writeStream?

1 Answer:

Answer 0 (score: 0)

Try this:

import org.apache.spark.sql.types.{StructField, StructType}

// Recursively lowercase every field name in the schema, including nested structs.
def columnsToLowercase(schema: StructType): StructType = {
  def recurRename(schema: StructType): Seq[StructField] =
    schema.fields.map {
      // Nested struct: rename the field and recurse into its children
      case StructField(name, dtype: StructType, nullable, meta) =>
        StructField(name.toLowerCase, StructType(recurRename(dtype)), nullable, meta)
      // Leaf field: just rename it
      case StructField(name, dtype, nullable, meta) =>
        StructField(name.toLowerCase, dtype, nullable, meta)
    }

  StructType(recurRename(schema))
}

val newDF = sparkSession.createDataFrame(dataFrame.rdd, columnsToLowercase(dataFrame.schema))
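
Note that `createDataFrame(dataFrame.rdd, ...)` only applies to a batch DataFrame; a streaming DataFrame read from Kafka does not expose `.rdd`. A minimal sketch of one alternative for the streaming case, assuming the nested struct column is named `jsonData` as in the schema above: cast the struct column to the lowercased schema, since casting between struct types with the same layout takes the field names from the target type.

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StructType

// Hypothetical streaming usage: kafka_df is the streaming DataFrame from the question.
// Lowercase the keys of the nested "jsonData" struct's current schema.
val lowered = columnsToLowercase(
  kafka_df.schema("jsonData").dataType.asInstanceOf[StructType])

// Casting the struct column to the renamed struct type rewrites the field names
// positionally, so no RDD conversion is needed.
val renamedDF = kafka_df.withColumn("jsonData", col("jsonData").cast(lowered))

renamedDF.writeStream
  .format("parquet")                        // assumed sink; any supported sink works
  .option("path", "/tmp/out")               // hypothetical output path
  .option("checkpointLocation", "/tmp/chk") // hypothetical checkpoint path
  .start()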