I am reading a JSON file into pandas and would like to know whether there is a way to tell which columns contain dictionaries.
Here is my JSON:
{
"employeeId":"**********",
"targetDateTime":"2019-03-24T21:55:48Z",
"balances":[
{
"balanceId":"Vacation",
"encumberedMinutes":4728,
"projectedMinutes":4728,
"vestedMinutes":4728,
"timeUnitMetaData":{
"hoursPerDay":8,
"timeUnit":"HOURS"
}
},
{
"balanceId":"Unpaid Time",
"encumberedMinutes":1950,
"projectedMinutes":1950,
"vestedMinutes":3210,
"timeUnitMetaData":{
"hoursPerDay":8,
"timeUnit":"HOURS"
}
},
{
"balanceId":"Personal Time Off",
"encumberedMinutes":1693,
"projectedMinutes":600,
"vestedMinutes":1693,
"timeUnitMetaData":{
"hoursPerDay":8,
"timeUnit":"HOURS"
}
}
]
}
import pandas as pd

# Load the JSON; the "balances" column ends up holding one dict per row
df = pd.read_json("testing.json", convert_dates=["targetDateTime"])
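What I want is to find the names of the columns that contain dictionaries and store them in a variable that I can feed to json_normalize.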
# Convert the column of dicts to its own DataFrame
# (pd.json_normalize is the top-level name since pandas 1.0)
df_dict = pd.json_normalize(df["balances"].tolist(), errors="ignore")
# Merge data frames on index
merged_df = pd.merge(df, df_dict, left_index=True, right_index=True)
# Drop the original dict column and the flattened timeUnitMetaData.* columns
merged_df = merged_df.drop(columns=merged_df.filter(regex="timeUnitMetaData|balances").columns)
print(merged_df)
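For the detection step itself, here is a minimal sketch of one possible approach. It is not a built-in pandas feature. `dict_columns` is a hypothetical helper name, the sample frame is made-up data shaped like the JSON above, and pandas 1.0 or later is assumed so that `pd.json_normalize` is available. It checks every value with `isinstance`, which is simple but may be slow on large frames.

```python
import pandas as pd

# Made-up sample data shaped like the question's frame (assumption, not the real file)
df = pd.DataFrame({
    "employeeId": ["E1", "E1"],
    "balances": [
        {"balanceId": "Vacation", "vestedMinutes": 4728},
        {"balanceId": "Unpaid Time", "vestedMinutes": 3210},
    ],
})


def dict_columns(frame):
    """Hypothetical helper: names of columns holding at least one dict value."""
    return [
        col
        for col in frame.columns
        if frame[col].map(lambda v: isinstance(v, dict)).any()
    ]


cols = dict_columns(df)
print(cols)

# Flatten each detected column and join the result back on the index.
# This assumes every value in a detected column is a dict (no NaN or None).
flat = df.drop(columns=cols)
for col in cols:
    expanded = pd.json_normalize(df[col].tolist())
    expanded.index = df.index
    flat = flat.join(expanded)
print(flat)
```

The list returned by `dict_columns` can then replace the hard-coded `"balances"` name in the script above. If two dict columns share key names, `join` would need an `lsuffix`/`rsuffix` or a per-column prefix to avoid a column clash.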