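+-------+-----+---------------------------------+
|ID     |CO_ID|DATA                             |
+-------+-----+---------------------------------+
|ABCD123|abc12|[{"month":"Jan","day":"monday"}] |
|BCHG345|wed34|[{"month":"Jul","day":"tuessay"}]|
+-------+-----+---------------------------------+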
I have the DataFrame above, in which the DATA column is of StringType. I want to convert it to a StructType. How can I do that?
Answer 0 (score: 0)
df.withColumn("data_struct",from_json($"data",StructType(Array(StructField("month", StringType),StructField("day", StringType)))))
On Spark 2.4.0, I get the following:
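import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{StructType, StructField, StringType}
// Single-row sample whose "data" column holds a JSON array as a string
val df = List("[{\"month\":\"Jan\",\"day\":\"monday\"}]").toDF("data")
// Parse the JSON string with an explicit two-field struct schema
val df2 = df.withColumn("data_struct", from_json($"data", StructType(Array(StructField("month", StringType), StructField("day", StringType)))))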
df2.show
+--------------------+-------------+
| data| data_struct|
+--------------------+-------------+
|[{"month":"Jan","...|[Jan, monday]|
+--------------------+-------------+
df2.printSchema
root
 |-- data: string (nullable = true)
 |-- data_struct: struct (nullable = true)
 |    |-- month: string (nullable = true)
 |    |-- day: string (nullable = true)
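Note that the sample DATA values are JSON arrays (they start with "[" and contain one object), while the schema above describes a single struct. How a struct schema is applied to a top-level array may differ between Spark versions, so a schema that mirrors the data more literally is an array of structs, from which the first element can then be taken. The following is only a minimal sketch under that assumption; the names elementSchema, data_array and flat are illustrative and not from the original post, and the local SparkSession is created here only so the snippet is self-contained.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Local session for the sketch; in spark-shell the existing session is reused.
val spark = SparkSession.builder().master("local[*]").appName("from-json-array").getOrCreate()
import spark.implicits._

// Same sample value as in the answer: a JSON array holding one object.
val df = List("""[{"month":"Jan","day":"monday"}]""").toDF("data")

// Schema of one array element (assumed: both fields are plain strings).
val elementSchema = StructType(Seq(
  StructField("month", StringType),
  StructField("day", StringType)))

// Parse the string as array<struct<month,day>>.
val parsed = df.withColumn("data_array", from_json($"data", ArrayType(elementSchema)))

// Take the first element as a struct and expose its fields as columns.
val flat = parsed
  .withColumn("data_struct", $"data_array".getItem(0))
  .select($"data_struct.month", $"data_struct.day")

flat.show()
```

If a row could hold more than one object in the array, using explode on data_array instead of getItem(0) would yield one output row per object.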