I have a DataFrame in which two columns contain JSON data, and I want to parse that JSON into proper columns of the DataFrame.
+------------+---------+--------------------+--------------------+
| firstname| lastname| travellerdetails| bookjson|
+------------+---------+--------------------+--------------------+
| K| Gupta|[{FlierNumber:","...|[{origin:DEL","Et...|
| K| Gupta|[{FlierNumber:","...|[{origin:DEL","Et...|
|Jana Ranjani|Raghu Raj|[{BaggageTypeRetu...|[{origin:AMD","De...|
+------------+---------+--------------------+--------------------+
Two of the columns contain JSON data, and those are the columns I want to parse.
The first row of travellerdetails is:
""[{""""FlierNumber"""":""""""""","BaggageTypeReturn"""":""""""""","FirstName"""":""""K""""","Title"""":""""1""""","MiddleName"""":""""D""""","LastName"""":""""Gupta""""","MealTypeOnward"""":""""""""","DateOfBirth"""":""""""""","BaggageTypeOnward"""":""""""""","SeatTypeOnward"""":""""""""","MealTypeReturn"""":""""""""","FrequentAirline"""":null","Type"""":""""A""""","SeatTypeReturn"""":""""""""}","{""""FlierNumber"""":""""""""","BaggageTypeReturn"""":""""""""","FirstName"""":""""Sweety""""","Title"""":""""2""""","MiddleName"""":""""""""","LastName"""":""""Gupta""""","MealTypeOnward"""":""""""""","DateOfBirth"""":""""""""","BaggageTypeOnward"""":""""""""","SeatTypeOnward"""":""""""""","MealTypeReturn"""":""""""""","FrequentAirline"""":null","Type"""":""""A""""","SeatTypeReturn"""":""""""""}]""
The first row of bookjson is:
""[{""""origin"""":""""DEL""""","EticketFlag"""":""""false""""","flightcode"""":""""251""""","farebasis"""":""""L0IP""""","spicestatus"""":""""Canceled""""","deptime"""":""""07:20""""","codeshare"""":""""""""","ibibopartner"""":""""indigonew""""","productclass"""":""""R""""","duration"""":""""2h 5m""""","ruleno"""":""""4910""""","qtype"""":""""fbs""""","tickettype"""":""""e""""","flightno"""":""""251""""","servicetype"""":""""""""","fareclass"""":""""L""""","faresequence"""":""""1""""","destination"""":""""GAU""""","carrierid"""":""""6E""""","stops"""":""""0""""","state"""":""""New""""","fare"""":{""""adultphf"""":50","adultttf"""":75","adultdf"""":115","totalsurcharge"""":0","indigonewgrossamount"""":10202","adulttotalfare"""":5101","totalcommission"""":0","adultbasefare"""":4150","totalpassengerhandlingfee"""":0","adultudf"""":562","adultpassengerservicefee"""":149","totalpassengerservicefee"""":0","totalothers"""":0","childtotalfare"""":0","totalbasefare"""":8300","totalfare"""":101...
Please help me parse these columns.
Answer 0 (score: 0)
What you are looking for is F.from_json().
You would use it like this:
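from pyspark.sql import functions as F
from pyspark.sql.types import ArrayType, MapType, StringType
# from_json needs a schema; an array of string-to-string maps is an assumption about this data
schema = ArrayType(MapType(StringType(), StringType()))
df = df.withColumn("travellerdetails", F.from_json(F.col("travellerdetails"), schema))
df = df.withColumn("bookjson", F.from_json(F.col("bookjson"), schema))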
Note, however, that the JSON you provided in the question is not valid, so parsing it will produce null.
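One quick way to check whether a value is valid JSON before handing it to Spark is to try it with Python's standard json module. The sketch below is only an illustration: the helper name is_valid_json is made up here, the broken string is a shortened fragment in the style of the question's data, and the fixed string is a hypothetical cleaned-up version of it, not the asker's real data.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError
        return False

# Shortened fragment in the style of the question's data (doubled quotes left in).
broken = '""[{""""FlierNumber"""":""""""""","FirstName"""":""""K""""}]""'

# Hypothetical cleaned-up version of the same fragment (an assumption).
fixed = '[{"FlierNumber": "", "FirstName": "K"}]'

print(is_valid_json(broken))
print(is_valid_json(fixed))
```

If a sample of the column fails this check, the extra quoting probably has to be cleaned up (or the file re-read with the correct CSV quote and escape settings) before from_json can return anything other than null.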
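Also note that from_json takes the schema as its second argument; in PySpark it is required. Supplying an explicit schema lets you specify the data type you want for each field.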