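I have a JSON file that contains arrays nested inside arrays. With the code below I can get all the "parts", but I can't work out which json_normalize parameters to use to pull out the different levels of the nested arrays.
That is, I want the 'id' from the vehicle array together with the 'id' from the model array, which contains all the parts arrays, for example:
car | camry | "value":"engine","price":10.82
Thanks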
import json

import pandas as pd

with open('sample.json') as f:
    data = json.load(f)

df1 = pd.json_normalize(data['vehicle'], 'model')
df2 = df1[['parts']]
ddf = pd.DataFrame(columns=['value', 'charge'])
for index, row in df2.iterrows():
    parts = row['parts']
    # models without a "parts" list come through as NaN, so skip them
    if isinstance(parts, list):
        ddf.loc[index] = [parts[0]['value'], parts[0]['charge']]
{
  "vehicle": [
    {
      "id": "car",
      "model": [
        {
          "id": "camry",
          "parts": [
            {
              "value": "engine",
              "charge": 10.82
            }
          ]
        },
        {
          "id": "avelon",
          "parts": [
            {
              "value": "seats",
              "charge": 538.26
            }
          ]
        },
        {
          "id": "prius",
          "parts": [
            {
              "value": "seats",
              "charge": 10.91
            }
          ]
        },
        {
          "id": "corolla",
          "markup": {
            "value": "61"
          },
          "accessories": [
            {
              "value": "vvvvv"
            }
          ]
        }
      ]
    }
  ]
}
Answer 0 (score: 1)
I think you need:
# remove NaNs
s = df1['parts'].dropna()
# create new DataFrame, assuming only one list always
df2 = pd.DataFrame(s.str[0].values.tolist(), index=s.index)
print(df2)
   charge   value
0   10.82  engine
1  538.26   seats
2   10.91   seats

# join to original
df = df1[['id']].join(df2)
print(df)
        id  charge   value
0    camry   10.82  engine
1   avelon  538.26   seats
2    prius   10.91   seats
3  corolla     NaN     NaN
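
If you also want the vehicle-level 'id' next to each part, as the question asks, json_normalize can probably do it in one call with record_path and meta. The following is only a sketch under a few assumptions: it uses pd.json_normalize (pandas 1.0 or later), it inlines a copy of the sample data instead of reading sample.json, and the names flat, vehicle and model are made up for illustration. Because json_normalize raises a KeyError when a record lacks the record_path key, the sketch first gives every model an empty "parts" list; as a side effect, models with no parts (corolla) produce no rows.

```python
import pandas as pd

# Inline copy of the sample data from the question (stands in for sample.json).
data = {
    "vehicle": [
        {
            "id": "car",
            "model": [
                {"id": "camry", "parts": [{"value": "engine", "charge": 10.82}]},
                {"id": "avelon", "parts": [{"value": "seats", "charge": 538.26}]},
                {"id": "prius", "parts": [{"value": "seats", "charge": 10.91}]},
                {"id": "corolla", "markup": {"value": "61"},
                 "accessories": [{"value": "vvvvv"}]},
            ],
        }
    ]
}

# Every record must contain the record_path key, so add an empty
# "parts" list to models that do not have one.
for vehicle in data["vehicle"]:
    for model in vehicle["model"]:
        model.setdefault("parts", [])

# record_path walks vehicle -> model -> parts; meta carries along the
# vehicle id ("id") and the model id (["model", "id"], column "model.id").
flat = pd.json_normalize(
    data["vehicle"],
    record_path=["model", "parts"],
    meta=["id", ["model", "id"]],
)

# Hypothetical, friendlier column names for the two id levels.
flat = flat.rename(columns={"id": "vehicle", "model.id": "model"})
print(flat[["vehicle", "model", "value", "charge"]])
```

Unlike the s.str[0] approach, this does not assume a single entry per "parts" list: each part becomes its own row. If you need to keep corolla as a row of NaNs, join the result back to df1[['id']] as shown above.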