How to move a top-level column into a struct column in DataFrames

Time: 2017-04-10 18:21:46

Tags: apache-spark apache-spark-sql

How can I change the following schema:

root
  |-- specialTypeCol_temp: string (nullable = true)
  |-- meta: struct (nullable = false)
      |-- validations: array (nullable = true)
To
root
  |-- meta: struct (nullable = false)
      |-- specialTypeCol_temp: string (nullable = true)
      |-- validations: array (nullable = true)

The schema does exist inside the meta struct.

1 answer:

Answer 0: (score: 0)

Use select to rebuild the struct with the column nested inside it:

import org.apache.spark.sql.functions._

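// Rebuild "meta" with the top-level column nested inside it.
// Field order follows the target schema in the question.
df.select(struct(
  col("specialTypeCol_temp"),
  col("meta.validations").alias("validations")
).alias("meta"))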

The same in Python:

from pyspark.sql.functions import col, struct

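# Rebuild "meta" with the top-level column nested inside it.
# Field order follows the target schema in the question.
df.select(struct(
    col("specialTypeCol_temp"),
    col("meta.validations").alias("validations")
).alias("meta"))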