I would like to know how to convert a DataFrame to JSON format.
name    | type      | count
'james' | 'message' | 4
'kane'  | 'text'    | 3
'james' | 'text'    | 2
'kane'  | 'message' | 3
---------------------------- Result ----------------------------
The DataFrame converted to JSON format:
data = [
{'name' : 'james', 'message' : 4, 'text' : 2}, {'name' : 'kane', 'message' : 3, 'text' : 3}
]
How can I change a DataFrame into JSON data?
Answer 0 (score: 1)
You can use the to_json and collect_list functions.
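import pyspark.sql.functions as f

# Pack each row into a struct, collect all structs into one array, then serialize it as JSON.
df1 = df.withColumn('json', f.struct('name', 'type', 'count')) \
    .groupBy().agg(f.collect_list('json').alias('data')) \
    .withColumn('data', f.to_json(f.struct(f.col('data'))))

# show() returns None, so call it separately to keep df1 a DataFrame.
df1.show(10, False)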
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|data |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|{"data":[{"name":"james","type":"message","count":4.0},{"name":"kane","type":"text","count":3.0},{"name":"james","type":"text","count":2.0},{"name":"kane","type":"message","count":3.0}]}|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
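Note that the output above has one JSON object per row, while the question asks for one object per name, with each type turned into a key. As a rough sketch of that reshaping step (not part of the original answer), here is a plain-Python version. It assumes the rows have already been brought to the driver, for example with something like [row.asDict() for row in df.collect()], which is only reasonable for small data. The rows list and the pivot_rows helper are hypothetical names used for illustration.

```python
import json

# Hypothetical stand-in for rows collected from the DataFrame,
# e.g. [row.asDict() for row in df.collect()] on a small dataset.
rows = [
    {"name": "james", "type": "message", "count": 4},
    {"name": "kane", "type": "text", "count": 3},
    {"name": "james", "type": "text", "count": 2},
    {"name": "kane", "type": "message", "count": 3},
]


def pivot_rows(rows):
    """Merge rows that share a name into one dict, keyed by type."""
    merged = {}
    for row in rows:
        # Create the per-name entry on first sight, then reuse it.
        entry = merged.setdefault(row["name"], {"name": row["name"]})
        # Sum counts in case the same (name, type) pair appears twice.
        entry[row["type"]] = entry.get(row["type"], 0) + row["count"]
    return list(merged.values())


data = pivot_rows(rows)
print(json.dumps(data))
```

For data too large to collect, the same reshaping can probably be done inside Spark by grouping on name and pivoting on type (groupBy('name').pivot('type')) before calling to_json; that variant is untested here.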