I have a streaming query that reads JSON-formatted data from Kafka. The keys are in camelCase and there are more than 1000 of them.
scala> kafka_df.printSchema()
root
 |-- jsonData: struct (nullable = true)
 |    |-- header: struct (nullable = true)
 |    |    |-- batch_id: string (nullable = true)
 |    |    |-- entity: string (nullable = true)
 |    |    |-- time: integer (nullable = true)
 |    |    |-- key: array (nullable = true)
 |    |    |    |-- element: string (containsNull = true)
 |    |    |-- message_type: string (nullable = true)
 |    |-- body: string (nullable = true)
How can I recursively change the keys to lowercase and turn the result back into a DataFrame so that I can write it out with writeStream?
Answer 0 (score: 0)
Try this:
import org.apache.spark.sql.types.{StructField, StructType}

// Recursively lowercases every field name, descending into nested structs.
def columnsToLowercase(schema: StructType): StructType = {
  def recurRename(schema: StructType): Seq[StructField] =
    schema.fields.map {
      // Nested struct: rename the field and recurse into its children.
      case StructField(name, dtype: StructType, nullable, meta) =>
        StructField(name.toLowerCase, StructType(recurRename(dtype)), nullable, meta)
      // Leaf field: just rename it.
      case StructField(name, dtype, nullable, meta) =>
        StructField(name.toLowerCase, dtype, nullable, meta)
    }
  StructType(recurRename(schema))
}

val newDF = sparkSession.createDataFrame(dataFrame.rdd, columnsToLowercase(dataFrame.schema))
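
Note that the last line rebuilds the DataFrame from an RDD, which is generally not supported on a streaming DataFrame read from Kafka. Below is a minimal sketch (not from the original answer) of one way to apply the lowercased schema to the streaming frame instead: casting a struct column to a StructType with the same layout renames its nested fields by position. The column name "jsonData" comes from the question's schema; the console sink and the names loweredSchema / loweredDF are illustrative assumptions.

// Hedged sketch: apply the lowercased schema by casting the struct column,
// since converting a streaming DataFrame via .rdd is not supported.
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StructType

val loweredSchema = columnsToLowercase(kafka_df.schema)

// Casting struct -> struct keeps the values and renames nested fields by position.
val loweredDF = kafka_df.select(
  col("jsonData").cast(loweredSchema("jsondata").dataType).alias("jsondata")
)

loweredDF.writeStream
  .format("console")        // placeholder sink for illustration
  .outputMode("append")
  .start()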