Creating a nested column in Spark Structured Streaming

Time: 2019-08-06 09:04:12

Tags: apache-spark spark-structured-streaming spark-java

How can I wrap a set of flat columns into a single nested (struct) column in Spark Streaming?

Actual:

root
 |-- channelId: string (nullable = true)
 |-- country: string (nullable = true)
 |-- product: string (nullable = true)
 |-- sourceId: string (nullable = true)
 |-- systemId: string (nullable = true)
 |-- destinationId: string (nullable = true)

Expected:

root
 |-- txn_summary: struct (nullable = true)
 |    |-- channelId: string (nullable = true)
 |    |-- country: string (nullable = true)
 |    |-- product: string (nullable = true)
 |    |-- sourceId: string (nullable = true)
 |    |-- systemId: string (nullable = true)
 |    |-- destinationId: string (nullable = true)

1 Answer:

Answer 0 (score: 0)

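You can use the struct function to wrap the columns. Because this is an ordinary select projection, it applies to a streaming Dataset just as it does to a batch one: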

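import static org.apache.spark.sql.functions.*; // static import: col() and struct() are static methods
import org.apache.spark.sql.Column;
Column[] cols = java.util.Arrays.stream(dataframe.columns()).map(c -> col(c)).toArray(Column[]::new); // struct() does not accept a String[] in Java
dataframe.select(struct(cols).as("txn_summary"));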