Renaming nested JSON fields in a schema

Date: 2019-07-01 16:20:37

Tags: scala apache-spark apache-spark-sql databricks

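Hi all, I'm new to Spark / Scala. I want to rename some nested JSON fields, because a lateral view fails when several JSON fields share the same name.

I want to rename the EffDate and ExpDate columns inside EmployeeAddr and EmployeePhone.

I have tried the withColumnRenamed and withColumn functions, but for some reason neither of them works for me.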


Code to load into dataframe:
val Employee= spark.read.format(Employeefile_type).option("header", "true").option("inferSchema","true").load(file_loction)



root
 |-- BirthDate: string (nullable = true)
 |-- EmployeeId: string (nullable = true)
 |-- EmployeeAddr: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- AddrTypeName: string (nullable = true)
 |    |    |-- City: string (nullable = true)
 |    |    |-- CtryCode: string (nullable = true)
 |    |    |-- EffDate: string (nullable = true)
 |    |    |-- ExpDate: string (nullable = true)
 |    |    |-- PostalCode: string (nullable = true)
 |    |    |-- Province: string (nullable = true)
 |    |    |-- Street1: string (nullable = true)
 |    |    |-- Street2: string (nullable = true)
 |-- EmployeeEmail: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- CrewEmailAddr: string (nullable = true)
 |    |    |-- EmailType: string (nullable = true)
 |-- EmployeeEmerContact: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- Addr: string (nullable = true)
 |    |    |-- FirstName: string (nullable = true)
 |    |    |-- LastName: string (nullable = true)
 |    |    |-- PrimaryPhone: string (nullable = true)
 |    |    |-- Relatnshp: string (nullable = true)
 |    |    |-- Title: string (nullable = true)
 |-- EmployeeEmplymntStatus: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- EmplymntStatusCode: string (nullable = true)
 |    |    |-- EmplymntStatusReason: string (nullable = true)
 |    |    |-- EndDate: string (nullable = true)
 |    |    |-- StartDate: string (nullable = true)
 |-- EmployeePhone: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- EmployeePhoneNumber: string (nullable = true)
 |    |    |-- EffDate: string (nullable = true)
 |    |    |-- ExpDate: string (nullable = true)
 |    |    |-- PhoneType: string (nullable = true)

1 Answer:

Answer 0 (score: 0)

You can apply the solution described here:

How to rename fields in an DataFrame corresponding to nested JSON

The approach is to replace the DataFrame schema, that is, to recreate the DataFrame with a new schema.
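The code from the linked answer is not reproduced here, so the following is only a minimal sketch of the idea, not the original author's code. It assumes the columns to fix are arrays of structs, as in the printed schema. The object `RenameNested`, its helpers `renameIn` and `withSchema`, and the new field names used further down are hypothetical names chosen for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{ArrayType, StructField, StructType}

object RenameNested {
  // Hypothetical helper: within the array-of-struct column `parent`,
  // rename the struct fields listed in `renames` (old name -> new name).
  def renameIn(schema: StructType, parent: String, renames: Map[String, String]): StructType =
    StructType(schema.fields.map {
      case StructField(name, ArrayType(elem: StructType, containsNull), nullable, meta)
          if name == parent =>
        val renamed = elem.fields.map(f => f.copy(name = renames.getOrElse(f.name, f.name)))
        StructField(name, ArrayType(StructType(renamed), containsNull), nullable, meta)
      case other => other // every other column is left untouched
    })

  // Rebuild the DataFrame over the same rows with the renamed schema.
  // Field order and types are unchanged, so the rows still line up by position.
  def withSchema(spark: SparkSession, df: DataFrame, schema: StructType): DataFrame =
    spark.createDataFrame(df.rdd, schema)
}
```

With the question's `Employee` DataFrame, this could be applied once per column, for example `RenameNested.renameIn(Employee.schema, "EmployeeAddr", Map("EffDate" -> "AddrEffDate", "ExpDate" -> "AddrExpDate"))`, then the same call on the result for `"EmployeePhone"` with `PhoneEffDate` / `PhoneExpDate`, and finally `RenameNested.withSchema(spark, Employee, newSchema)`. Because only names change, the date fields no longer collide when both arrays are exploded.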