I have a DataFrame with the following schema:
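I want to flatten the struct so that all of its fields (asin, customerId, eventTime, and so on) become columns of the DataFrame. I tried the explode function, but it works on array columns, not on struct types. Is it possible to convert the DataFrame below into that flattened form?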
 |-- data: struct (nullable = true)
 |    |-- asin: string (nullable = true)
 |    |-- customerId: long (nullable = true)
 |    |-- eventTime: long (nullable = true)
 |    |-- marketplaceId: long (nullable = true)
 |    |-- rating: long (nullable = true)
 |    |-- region: string (nullable = true)
 |    |-- type: string (nullable = true)
 |-- uploadedDate: long (nullable = true)
Answer 0: (score: 2)
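This is straightforward:
val newDF = df.select("uploadedDate", "data.*")
This tells select to take uploadedDate and then every sub-field of the data column; the "data.*" selector expands a struct into one top-level column per field.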
Example:
scala> case class A(a: Int, b: Double)
scala> val df = Seq((A(1, 1.0), "1"), (A(2, 2.0), "2")).toDF("data", "uploadedDate")
scala> val newDF = df.select("uploadedDate", "data.*")
scala> newDF.show()
+------------+---+---+
|uploadedDate|  a|  b|
+------------+---+---+
|           1|  1|1.0|
|           2|  2|2.0|
+------------+---+---+
scala> newDF.printSchema()
root
|-- uploadedDate: string (nullable = true)
|-- a: integer (nullable = true)
|-- b: double (nullable = true)
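For a shape closer to the one in the question, the same select can be sketched as follows. This is only an illustration under a few assumptions: the sample values, the app name, and the reduced set of fields (asin, customerId, eventTime, region) are made up, and the nested data column is built with the struct function rather than read from the asker's real source. It is meant to be pasted into spark-shell or run as a Scala script.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.struct

// Local session for the sketch; in spark-shell getOrCreate() reuses the existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("flatten-struct-sketch") // hypothetical app name
  .getOrCreate()
import spark.implicits._

// Made-up sample row; the field names follow the schema in the question.
val nested = Seq(("B00EXAMPLE", 42L, 1500000000L, "NA", 1500000100L))
  .toDF("asin", "customerId", "eventTime", "region", "uploadedDate")
  // Pack four columns into a struct column named "data" to mimic the nested schema.
  .select(struct($"asin", $"customerId", $"eventTime", $"region").as("data"), $"uploadedDate")

// "data.*" expands every field of the struct into its own top-level column.
val flat = nested.select("uploadedDate", "data.*")
flat.printSchema()
```

The resulting column order follows the select call: uploadedDate first, then the struct fields in their declared order. If only some fields are needed, they can be named individually instead, for example nested.select("uploadedDate", "data.asin").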