I have this short sample of ADSB JSON data and would like to convert it into DataFrame columns such as Icao, Alt, Lat, Long, Spd, Cou...
After Alperen suggested
df = pd.read_json('2016-06-20-2359Z.json', lines=True)
I can load it into a DataFrame. However, df.acList is
[{'Id': 10537990, 'Rcvr': 1, 'HasSig': False, ... Name: acList, dtype: object
How do I get the Icao, Alt, Lat, Long, Spd, Cou data?
"src":1, "feeds":[ { "id":1, "name":"ADSBexchange.com", "polarPlot":false } ], "srcFeed":1, "showSil":true, "showFlg":true, "showPic":true, "flgH":20, "flgW":85, "acList":[ { "Id":11281748, "Rcvr":1, "HasSig":false, "Icao":"AC2554", "Bad":false, "Reg":"N882AS", "FSeen":"\/Date(1466467166951)\/", "TSecs":3, "CMsgs":1, "AltT":0, "Tisb":false, "TrkH":false, "Type":"CRJ2", "Mdl":"2001 BOMBARDIER INC CL-600-2B19", "Man":"Bombardier", "CNum":"7503", "Op":"EXPRESSJET AIRLINES INC - ATLANTA, GA", "OpIcao":"ASQ", "Sqk":"", "VsiT":0, "WTC":2, "Species":1, "Engines":"2", "EngType":3, "EngMount":1, "Mil":false, "Cou":"United States", "HasPic":false, "Interested":false, "FlightsCount":0, "Gnd":false, "SpdTyp":0, "CallSus":false, "TT":"a", "Trt":1, "Year":"2001" }, { "Id":11402205, "Rcvr":1, "HasSig":true, "Sig":110, "Icao":"ADFBDD", "Bad":false, "FSeen":"\/Date(1466391940977)\/", "TSecs":75229, "CMsgs":35445, "Alt":8025, "GAlt":8025, "AltT":0, "Call":"TEST1234", "Tisb":false, "TrkH":false, "Sqk":"0262", "Help":false, "VsiT":0, "WTC":0, "Species":0, "EngType":0, "EngMount":0, "Mil":true, "Cou":"United States", "HasPic":false, "Interested":false, "FlightsCount":0, "Gnd":true, "SpdTyp":0, "CallSus":false, "TT":"a", "Trt":1 } ], "totalAc":4231, "lastDv":"636019887431643594", "shtTrlSec":61, "stm":1466467170029 }
Answer 0 (score: 6)
If you already have the data in the acList column of a pandas DataFrame, simply do:
import pandas as pd
pd.io.json.json_normalize(df.acList[0])
Alt AltT Bad CMsgs CNum Call CallSus Cou EngMount EngType ... Sqk TSecs TT Tisb TrkH Trt Type VsiT WTC Year
0 NaN 0 False 1 7503 NaN False United States 1 3 ... 3 a False False 1 CRJ2 0 2 2001
1 8025.0 0 False 35445 NaN TEST1234 False United States 0 0 ... 0262 75229 a False False 1 NaN 0 0 NaN
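To narrow the flattened frame down to just the fields the question asks for, something like the following may work. This is a minimal sketch, assuming pandas 1.0 or later (for `pd.json_normalize`); `records` and `wanted` are illustrative names, and the records are trimmed copies of the two sample aircraft. `reindex` is used instead of plain column selection because not every record carries every field (neither sample record has Lat, Long or Spd), so absent fields become NaN columns rather than raising a KeyError.

```python
import pandas as pd

# Two records trimmed from the sample acList in the question.
records = [
    {"Id": 11281748, "Icao": "AC2554", "Cou": "United States"},
    {"Id": 11402205, "Icao": "ADFBDD", "Alt": 8025, "Cou": "United States"},
]

# The fields requested in the question, in the desired column order.
wanted = ["Icao", "Alt", "Lat", "Long", "Spd", "Cou"]

flat = pd.json_normalize(records)
# reindex keeps the requested order and fills fields no record has with NaN.
result = flat.reindex(columns=wanted)
print(result)
```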
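Answer 1 (score: 3)
@Sergey's answer solved this for me, but I ran into a problem because the JSON in my DataFrame column was stored as strings rather than as objects. I had to add an extra step that maps the column through json.loads: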
import json
import pandas as pd
pd.io.json.json_normalize(df.acList.apply(json.loads))
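A self-contained illustration of the string case, as a sketch only: the `payload` column and its contents are made up for the example, and `pd.json_normalize` assumes pandas 1.0 or later. Each cell holds JSON text for one object, so it has to be decoded before it can be flattened.

```python
import json
import pandas as pd

# Hypothetical frame whose 'payload' column holds JSON text, one object per row.
df = pd.DataFrame({
    "payload": [
        '{"Icao": "AC2554", "Cou": "United States"}',
        '{"Icao": "ADFBDD", "Alt": 8025, "Cou": "United States"}',
    ]
})

# Decode each string into a dict, then flatten the resulting list of dicts.
parsed = df["payload"].apply(json.loads)
flat = pd.json_normalize(parsed.tolist())
print(flat)
```

Passing a plain list via `tolist()` sidesteps any differences in how pandas versions treat a Series argument.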
Answer 2 (score: 0)
I can't comment on ThinkBonobo's answer yet, but if the JSON in the column is not exactly a dictionary, you can keep chaining .apply until it is. So in my case:
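import json
import pandas as pd
from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize
json_normalize(
    df
    .theColumnWithJson
    .apply(json.loads)
    .apply(lambda x: x[0])  # the inner JSON is a list with the dictionary as its only item
)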
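Taking `x[0]` keeps only the first element of each list. When the lists can hold several dictionaries, as acList does, one alternative (not what this answer does) is `Series.explode`, available since pandas 0.25. A rough sketch, with a made-up third aircraft and `pd.json_normalize` assuming pandas 1.0 or later:

```python
import pandas as pd

# Hypothetical column where each cell is a list of dicts.
df = pd.DataFrame({
    "acList": [
        [{"Icao": "AC2554"}, {"Icao": "ADFBDD", "Alt": 8025}],
        [{"Icao": "A1B2C3", "Alt": 31000}],  # invented record for illustration
    ]
})

# explode yields one row per list element; dropna skips rows from empty lists.
items = df["acList"].explode().dropna()
flat = pd.json_normalize(items.tolist())
print(flat)
```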
Answer 3 (score: 0)
Since pandas 1.0, json_normalize is available in the top-level namespace, so use:
import pandas as pd
pd.json_normalize(df.acList[0])
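The top-level function can also flatten the nested list straight from the parsed document, without going through a DataFrame column first. A minimal sketch, assuming pandas 1.0 or later; `raw` is a heavily trimmed stand-in for the file contents shown in the question, and pulling `totalAc` in as metadata is only there to show the `meta` parameter.

```python
import json
import pandas as pd

# Trimmed stand-in for the JSON document in the question.
raw = """{"src": 1, "totalAc": 4231, "acList": [
  {"Id": 11281748, "Icao": "AC2554", "Cou": "United States"},
  {"Id": 11402205, "Icao": "ADFBDD", "Alt": 8025, "Cou": "United States"}
]}"""

doc = json.loads(raw)
# record_path selects the list to flatten; meta copies top-level keys onto each row.
flat = pd.json_normalize(doc, record_path="acList", meta=["totalAc"])
print(flat)
```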
Answer 4 (score: 0)
In my case some values were missing (None), so I wrote more specific code that also drops the original column after creating the new ones:
import pandas as pd

for prefix in ['column1', 'column2']:
    # replace missing cells with an empty dict so every row can be normalized
    df_temp = df[prefix].apply(lambda x: {} if pd.isna(x) else x)
    df_temp = pd.io.json.json_normalize(df_temp)  # pd.json_normalize on pandas >= 1.0
    df_temp = df_temp.add_prefix(prefix + '_')
    df.drop([prefix], axis=1, inplace=True)
    df = pd.concat([df, df_temp], axis=1, sort=False)
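A runnable variant of the same idea on toy data, offered as a sketch rather than a drop-in replacement: the frame, its `id` column and the dict contents are invented, and `pd.json_normalize` assumes pandas 1.0 or later. It differs from the loop above in two small ways: it tests `isinstance(x, dict)` instead of `pd.isna(x)`, and it copies the original index onto the expanded frame so the concat lines up even when `df` does not have a default index.

```python
import pandas as pd

# Toy frame: 'column1' holds dicts, with one missing value.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "column1": [{"a": 1, "b": 2}, None, {"a": 3}],
})

for prefix in ["column1"]:
    # Anything that is not a dict (None, NaN) becomes an empty dict.
    cells = df[prefix].apply(lambda x: x if isinstance(x, dict) else {})
    expanded = pd.json_normalize(cells.tolist()).add_prefix(prefix + "_")
    expanded.index = df.index  # keep rows aligned for the concat
    df = pd.concat([df.drop(columns=[prefix]), expanded], axis=1)

print(df)
```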