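Unpacking a list of dictionaries into a Pandas DataFrame

Time: 2020-10-04 11:05:06

Tags: python json pandas

I have been trying to unpack this in pandas the way I would a dictionary or JSON, but I cannot get it into a DataFrame. Can anyone point me in the right direction?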

0 [{'JournalLineID': 'XXX', 'AccountID': 'XXX', 'AccountCode': '200', 'AccountType': 'XXX', 'AccountName': 'XXX', 'Description': '', 'NetAmount': -428.0, 'GrossAmount': -428.0, 'TaxAmount': 0.0, 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]}, {'JournalLineID': 'XXX2', 'AccountID': 'XXX', 'AccountCode': 'XXX', 'AccountType': 'EXPENSE', 'AccountName': 'Subscriptions - Software', 'Description': 'XXXX', 'NetAmount': 400.0, 'GrossAmount': 428.0, 'TaxAmount': 28.0, 'TaxType': 'INPUT', 'TaxName': 'Purchases 7%', 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]}]

When I try pd.DataFrame.from_records(df), the output is split into individual characters: 0 [ { ' J o u r n a l ... s ' : [ ] } ] } ]

When I try pd.DataFrame(df), this is the output: 0 [{'JournalLineID': 'XXX', 'AccountID': 'XXX', 'AccountCode': '200', 'AccountType': 'XXX', 'AccountName': 'XXX', 'Description': '', 'NetAmount': -428.0, 'GrossAmount': -428.0, 'TaxAmount': 0.0, 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]}, {'JournalLineID': 'XXX2', 'AccountID': 'XXX', 'AccountCode': 'XXX', 'AccountType': 'EXPENSE', 'AccountName': 'Subscriptions - Software', 'Description': 'XXXX', 'NetAmount': 400.0, 'GrossAmount': 428.0, 'TaxAmount': 28.0, 'TaxType': 'INPUT', 'TaxName': 'Purchases 7%', 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]}]

3 Answers:

Answer 0: (score: 1)

Remove the leading 0 (the index label), then call pd.DataFrame() on the rest.
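The idea above could look roughly like the sketch below. It assumes the asker's object is a Series whose single cell holds the list of dicts as text, which would explain the character-by-character output; the post does not confirm this. The variable names and the shortened sample data are hypothetical.

```python
import ast

import pandas as pd

# Hypothetical stand-in for the asker's data: a Series whose single cell
# holds the list of dicts as a string (an assumption, not confirmed by the post).
raw = pd.Series([
    "[{'JournalLineID': 'XXX', 'NetAmount': -428.0}, "
    "{'JournalLineID': 'XXX2', 'NetAmount': 400.0, 'TaxType': 'INPUT'}]"
])

# Select the cell itself, which leaves the index label 0 behind.
cell = raw.iloc[0]

# Parse only if the cell is text; a real list can be passed on unchanged.
records = ast.literal_eval(cell) if isinstance(cell, str) else cell

df = pd.DataFrame(records)
print(df)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for text like this. If the cell already holds a real list, the isinstance check skips the parsing step.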

Answer 1: (score: 1)

Just pass the list to pd.DataFrame() as input:

import pandas as pd

d = [{'JournalLineID': 'e08fdfe0-560f-40f5-8e99-f239e187808b', 'AccountID': '56278544-5930-4396-b2ef-0453731c7f51', 'AccountCode': '200', 'AccountType': 'CURRLIAB', 'AccountName': 'Accounts Payable', 'Description': '', 'NetAmount': -428.0, 'GrossAmount': -428.0, 'TaxAmount': 0.0, 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'ea68de6e-32b6-4e02-a748-f916315804b0', 'TrackingOptionID': '0b785474-a48f-4413-aba5-819db2852f10', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': '94006aa4-a890-424e-be13-9786aa58732a', 'TrackingOptionID': '64b0f33a-d541-4316-bb41-6a0c3326d7a2', 'Options': []}]}, {'JournalLineID': 'cb2e42c7-b4e5-4ebb-875d-4ece7336efe4', 'AccountID': '64754738-d650-418e-8233-f578c9d65850', 'AccountCode': '652', 'AccountType': 'EXPENSE', 'AccountName': 'Subscriptions - Software', 'Description': 'Talenox Suite Plan - 48 pax - 11 June to 11 July 2020', 'NetAmount': 400.0, 'GrossAmount': 428.0, 'TaxAmount': 28.0, 'TaxType': 'INPUT', 'TaxName': 'Purchases 7%', 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'ea68de6e-32b6-4e02-a748-f916315804b0', 'TrackingOptionID': '0b785474-a48f-4413-aba5-819db2852f10', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': '94006aa4-a890-424e-be13-9786aa58732a', 'TrackingOptionID': '64b0f33a-d541-4316-bb41-6a0c3326d7a2', 'Options': []}]}, {'JournalLineID': '873e394f-10c1-4366-bad3-7521d1ff5957', 'AccountID': '50647912-37a6-4fd0-8717-7373f9ca32e0', 'AccountCode': '205', 'AccountType': 'CURRLIAB', 'AccountName': 'GST/VAT Control A/c', 'Description': '', 'NetAmount': 28.0, 'GrossAmount': 28.0, 'TaxAmount': 0.0, 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'ea68de6e-32b6-4e02-a748-f916315804b0', 'TrackingOptionID': '0b785474-a48f-4413-aba5-819db2852f10', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 
'94006aa4-a890-424e-be13-9786aa58732a', 'TrackingOptionID': '64b0f33a-d541-4316-bb41-6a0c3326d7a2', 'Options': []}]}]

df = pd.DataFrame(d)
df

Out[11]: 
                          JournalLineID  ...       TaxName
0  e08fdfe0-560f-40f5-8e99-f239e187808b  ...           NaN
1  cb2e42c7-b4e5-4ebb-875d-4ece7336efe4  ...  Purchases 7%
2  873e394f-10c1-4366-bad3-7521d1ff5957  ...           NaN

[3 rows x 12 columns]
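In the frame above, TrackingCategories is still a nested list in each row. If one row per tracking category is wanted, pd.json_normalize (available at the top level since pandas 1.0) may help. The following is only a sketch on a shortened, hypothetical copy of the data; the choice of meta fields is an assumption about which parent columns matter.

```python
import pandas as pd

# Shortened, hypothetical copy of the records above (IDs replaced by placeholders).
d = [
    {"JournalLineID": "A", "AccountCode": "200", "NetAmount": -428.0,
     "TrackingCategories": [{"Name": "Location", "Option": "SG"},
                            {"Name": "Sales Rep/Dept", "Option": "HQ"}]},
    {"JournalLineID": "B", "AccountCode": "652", "NetAmount": 400.0,
     "TrackingCategories": [{"Name": "Location", "Option": "SG"},
                            {"Name": "Sales Rep/Dept", "Option": "HQ"}]},
]

# One row per tracking category, with the chosen parent fields repeated alongside.
flat = pd.json_normalize(
    d,
    record_path="TrackingCategories",
    meta=["JournalLineID", "AccountCode", "NetAmount"],
)
print(flat)
```

Keys listed in meta must be present in every record unless errors="ignore" is passed, so optional fields such as TaxType are left out of this sketch.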

Answer 2: (score: 0)

I think your second approach is almost there.

import pandas as pd

Dictionary = [{'JournalLineID': 'XXX', 'AccountID': 'XXX', 'AccountCode': '200', 'AccountType': 'XXX', 'AccountName': 'XXX', \
               'Description': '', 'NetAmount': -428.0, 'GrossAmount': -428.0, 'TaxAmount': 0.0, \
               'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', \
                                       'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', \
                                                                                   'TrackingCategoryID': 'XXX', \
                                                                                   'TrackingOptionID': 'XXX', 'Options': []}]}, \
              {'JournalLineID': 'XXX2', 'AccountID': 'XXX', 'AccountCode': 'XXX', 'AccountType': 'EXPENSE', \
               'AccountName': 'Subscriptions - Software', 'Description': 'XXXX', 'NetAmount': 400.0, 'GrossAmount': 428.0, \
               'TaxAmount': 28.0, 'TaxType': 'INPUT', 'TaxName': 'Purchases 7%', \
               'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', \
                                       'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', \
                                                                                   'TrackingCategoryID': 'XXX', \
                                                                                   'TrackingOptionID': 'XXX', 'Options': []}]}]
df = pd.DataFrame(Dictionary)
df

The output is as follows:

   AccountCode  AccountID               AccountName  AccountType  Description  GrossAmount  JournalLineID  NetAmount  TaxAmount       TaxName  TaxType                                 TrackingCategories
0          200        XXX                       XXX          XXX                    -428.0            XXX     -428.0        0.0           NaN      NaN  [{'Name': 'Location', 'Option': 'SG', 'Trackin...
1          XXX        XXX  Subscriptions - Software      EXPENSE         XXXX        428.0           XXX2      400.0       28.0  Purchases 7%    INPUT  [{'Name': 'Location', 'Option': 'SG', 'Trackin...
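
The TrackingCategories column in this output still holds a list of dicts per row. One possible way to spread it into one column per category name is sketched below. It assumes every row has a list in that column and that each entry carries Name and Option keys, as in the sample; the records and variable names here are a shortened, hypothetical version of the data.

```python
import pandas as pd

# Shortened, hypothetical version of the data used in this answer.
records = [
    {"JournalLineID": "XXX", "NetAmount": -428.0,
     "TrackingCategories": [{"Name": "Location", "Option": "SG"},
                            {"Name": "Sales Rep/Dept", "Option": "HQ"}]},
    {"JournalLineID": "XXX2", "NetAmount": 400.0,
     "TrackingCategories": [{"Name": "Location", "Option": "SG"},
                            {"Name": "Sales Rep/Dept", "Option": "HQ"}]},
]
df = pd.DataFrame(records)

# Build a {Name: Option} dict per row (assumes each cell is a list of dicts).
tracking = df["TrackingCategories"].apply(
    lambda cats: {c["Name"]: c["Option"] for c in cats}
)

# Expand those dicts into columns and attach them in place of the nested column.
wide = pd.concat(
    [df.drop(columns="TrackingCategories"),
     pd.DataFrame(tracking.tolist(), index=df.index)],
    axis=1,
)
print(wide)
```

This keeps one row per journal line, which may be easier to aggregate than the nested form.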