When writing data from a Spark DataFrame to CSV, I want to remove the quotes from the numeric data only.
Actual output:
+-------+---------+-----+
|user_id|   course|marks|
+-------+---------+-----+
|    "1"|    "eng"|  "9"|
|    "1"| "french"|  "7"|
+-------+---------+-----+
Expected output:
+-------+---------+-----+
|user_id|   course|marks|
+-------+---------+-----+
|      1|    "eng"|    9|
|      1| "french"|    7|
+-------+---------+-----+
Answer 0 (score: 0)
In the DataFrame, cast the numeric columns to IntegerType:

import org.apache.spark.sql.types.IntegerType

// Cast user_id and marks to integers; course stays a string column.
df
  .select(df("user_id").cast(IntegerType), df("course"), df("marks").cast(IntegerType))
  .show()
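
The `show()` output in the question suggests the quote characters may be part of the string values themselves. In that case a plain cast would not parse them as numbers, so the quotes need to be stripped first. The following is a minimal sketch under that assumption, not a definitive fix: the `castQuoted` helper, the object name, and the sample rows are hypothetical and only mirror the question's columns.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, regexp_replace}
import org.apache.spark.sql.types.IntegerType

object CastQuotedNumbers {
  // Hypothetical helper: strips double quotes from the named columns and
  // casts them to IntegerType; every other column is passed through unchanged.
  def castQuoted(df: DataFrame, numericCols: Set[String]): DataFrame =
    df.select(df.columns.map { name =>
      if (numericCols.contains(name))
        regexp_replace(col(name), "\"", "").cast(IntegerType).as(name)
      else
        col(name)
    }: _*)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cast-quoted-numbers")
      .getOrCreate()
    import spark.implicits._

    // Assumed sample data: the quotes are literally part of each string value.
    val raw = Seq(
      ("\"1\"", "\"eng\"", "\"9\""),
      ("\"1\"", "\"french\"", "\"7\"")
    ).toDF("user_id", "course", "marks")

    val typed = castQuoted(raw, Set("user_id", "marks"))
    typed.printSchema() // user_id and marks are now integer columns
    typed.show()

    spark.stop()
  }
}
```

This only changes the column types and values inside the DataFrame. How the CSV writer quotes fields on output is controlled separately by writer options such as `quote` and `quoteAll`, so those may also need checking if quotes still appear in the written file.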