Problem with json_normalize and multiple record_path values

Time: 2019-08-16 12:17:06

Tags: python json pandas

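Apologies if this has been asked before. I'm new to pandas and am trying to use json_normalize() to flatten a nested API response into a tabular format. I can't work out how to put different levels of nesting into the record_path argument. My current code keeps raising KeyError: 'Type'.

I'm lost as to what to try or where to go next. Thanks.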

Error:

    result = result[spec]
KeyError: 'Type'

Desired output:

  Count     Metric  Title   Platform   Begin_Date   End_Date        Type         Value
   1    Total_Req   AACN      OVID     2019-01-01   2019-02-28    Print_ISSN  1234-5678

Code snippet:

import json
from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize

try:
    # get data from vendors
    data = json.loads(response.text)

    print("Processing Data....")
    table = json_normalize(
        data['Report_Items'][0],
        record_path=['Performance', 'Instance', 'Item_ID'],
        meta=['Title', 'Platform', ['Performance', 'Period', 'Begin_Date'],
              ['Item_ID', 'Type'], ['Performance', 'Period', 'End_Date'], 'Publisher'],
        errors='ignore', record_prefix="Test_", sep='_')
    table.to_html('october_stats.html')  # output to an HTML file
    table.to_excel('annual_stats.xlsx', sheet_name='NE_Stats')  # output to an Excel file
except json.decoder.JSONDecodeError:
    # data is never assigned when parsing fails, so print the raw response instead
    print(response.text, "This is not a JSON format..")  # catch vendor JSON errors

1 Answer:

Answer 0 (score: 0)

You may need to tweak this a bit, since your expected output doesn't match the sample JSON snippet, but it should get you going:

import json
import re

import pandas as pd

data = json.loads(response.text)

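def flatten_json(y):
    """Flatten nested dicts/lists into one dict with '_'-joined keys."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + '_')
        elif isinstance(x, list):
            for i, element in enumerate(x):
                flatten(element, name + str(i) + '_')
        else:
            out[name[:-1]] = x  # strip the trailing '_'

    flatten(y)
    return out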


flat = flatten_json(data)


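results = pd.DataFrame()
special_cols = []  # keys with no list index, e.g. top-level fields

columns_list = list(flat.keys())
for item in columns_list:
    try:
        # the first "_<n>_" in the key is used as the row index
        row_idx = re.findall(r'_(\d+)_', item)[0]
    except IndexError:
        special_cols.append(item)
        continue
    # everything after that index becomes the column name
    column = re.findall(r'_\d+_(.*)', item)[0]
    column = column.replace('_', '')

    row_idx = int(row_idx)
    value = flat[item]

    results.loc[row_idx, column] = value

# top-level fields are broadcast to every row
for item in special_cols:
    results[item] = flat[item]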

Output:

print(results.to_string())
          Type          Value PeriodBeginDate PeriodEndDate  Instance0MetricType  Instance0Count   Instance1MetricType  Instance1Count                        Title Platform                                     Publisher
0  Proprietary     Ovid:21790      2019-02-01    2019-02-28  Total_Item_Requests             1.0  Unique_Item_Requests             1.0  AACN Advanced Critical Care   OvidMD  American Association of Critical Care Nurses
1  Proprietary  Ovid:01256961             NaN           NaN                  NaN             NaN                   NaN             NaN  AACN Advanced Critical Care   OvidMD  American Association of Critical Care Nurses
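
If you would rather stay with json_normalize, here is a minimal sketch of one possible approach. It assumes pandas 1.0 or later (for pd.json_normalize) and a report item shaped like the hypothetical dict below, which is reconstructed from the output above and is not the asker's real response. As far as I understand, nested meta paths are resolved along the record_path, so a field on a sibling branch such as Item_ID cannot be pulled in as meta for Performance/Instance. The sketch therefore normalizes the two lists separately and merges them.

```python
import pandas as pd

# Hypothetical report item, reconstructed from the output shown above.
item = {
    "Title": "AACN Advanced Critical Care",
    "Platform": "OvidMD",
    "Publisher": "American Association of Critical Care Nurses",
    "Item_ID": [
        {"Type": "Proprietary", "Value": "Ovid:21790"},
        {"Type": "Proprietary", "Value": "Ovid:01256961"},
    ],
    "Performance": [
        {
            "Period": {"Begin_Date": "2019-02-01", "End_Date": "2019-02-28"},
            "Instance": [
                {"Metric_Type": "Total_Item_Requests", "Count": 1},
                {"Metric_Type": "Unique_Item_Requests", "Count": 1},
            ],
        }
    ],
}

# One row per Instance; the Period fields come from the enclosing
# Performance entry, the rest from the top level of the item.
metrics = pd.json_normalize(
    item,
    record_path=["Performance", "Instance"],
    meta=[
        "Title", "Platform", "Publisher",
        ["Performance", "Period", "Begin_Date"],
        ["Performance", "Period", "End_Date"],
    ],
    sep="_",
)

# One row per Item_ID entry, keyed by Title so the frames can be joined.
ids = pd.json_normalize(item, record_path="Item_ID", meta=["Title"])

# Every metric row is paired with every identifier of the same title.
table = metrics.merge(ids, on="Title").rename(
    columns={
        "Performance_Period_Begin_Date": "Begin_Date",
        "Performance_Period_End_Date": "End_Date",
    }
)

print(table.to_string())
```

Joining on Title is only an assumption that works for a single item; with a full Report_Items list you would probably want a key that is unique per item.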