I have a DataFrame whose column types are shown below:
orders.printSchema()
root
|-- order_id: long (nullable = true)
|-- user_id: long (nullable = true)
|-- eval_set: string (nullable = true)
|-- order_number: short (nullable = true)
|-- order_dow: short (nullable = true)
|-- order_hour_of_day: short (nullable = true)
|-- days_since_prior_order: short (nullable = true)
But when I register it as a temporary view, every column's data type shows up as string:
orders.createOrReplaceTempView("orders")
spark.sql("describe orders").show()
+--------------------+---------+-------+
| col_name|data_type|comment|
+--------------------+---------+-------+
| order_id| string| |
| user_id| string| |
| eval_set| string| |
| order_number| string| |
| order_dow| string| |
| order_hour_of_day| string| |
|days_since_prior_...| string| |
+--------------------+---------+-------+
So how can I preserve the original column types when going from the DataFrame to the table in PySpark?
Answer 0 (score: 0)
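No, createOrReplaceTempView does not change the schema. I tested this in Spark Scala, and the view keeps the DataFrame's schema. The behaviour you are seeing may be specific to PySpark.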
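If you want to check this on your own setup, here is a minimal sketch that registers a DataFrame as a temp view, reads the view back with spark.table, and compares the column types. The names temp_view_schema_matches and main, the app name, and the sample row are made up for illustration; only createOrReplaceTempView, spark.table and dtypes are actual PySpark calls. It assumes a local Spark installation.

```python
def temp_view_schema_matches(spark, df, view_name):
    """Register df as a temp view and compare the view's column types with df's.

    Hypothetical helper for illustration, not part of PySpark.
    Returns (types_match, view_dtypes).
    """
    df.createOrReplaceTempView(view_name)
    view_df = spark.table(view_name)
    # dtypes is a list of (column_name, type_string) pairs
    return view_df.dtypes == df.dtypes, view_df.dtypes


def main():
    # Imported here so the helper above has no import-time dependency on PySpark.
    from pyspark.sql import SparkSession
    from pyspark.sql.types import (
        LongType, ShortType, StringType, StructField, StructType,
    )

    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("schema-check")  # arbitrary name, an assumption
        .getOrCreate()
    )

    # A reduced version of the schema from the question, with one sample row.
    schema = StructType([
        StructField("order_id", LongType(), True),
        StructField("eval_set", StringType(), True),
        StructField("order_number", ShortType(), True),
    ])
    orders = spark.createDataFrame([(1, "prior", 1)], schema)

    same, view_dtypes = temp_view_schema_matches(spark, orders, "orders")
    print("types preserved:", same)
    print(view_dtypes)

    spark.stop()
```

Call main() to run the check. If it reports that the types are preserved but describe still shows string for your real data, the DataFrame that was actually registered probably already had string columns (for example, it was reassigned or re-read somewhere between printSchema and createOrReplaceTempView), so it is worth calling printSchema again immediately before registering the view.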