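I have been trying to unpack this in pandas the way I would a dictionary or JSON, but I can't get it into a DataFrame. Can anyone point me in the right direction?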
0 [{'JournalLineID': 'XXX', 'AccountID': 'XXX', 'AccountCode': '200', 'AccountType': 'XXX', 'AccountName': 'XXX', 'Description': '', 'NetAmount': -428.0, 'GrossAmount': -428.0, 'TaxAmount': 0.0, 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]}, {'JournalLineID': 'XXX2', 'AccountID': 'XXX', 'AccountCode': 'XXX', 'AccountType': 'EXPENSE', 'AccountName': 'Subscriptions - Software', 'Description': 'XXXX', 'NetAmount': 400.0, 'GrossAmount': 428.0, 'TaxAmount': 28.0, 'TaxType': 'INPUT', 'TaxName': 'Purchases 7%', 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]}]
When I try pd.DataFrame.from_records(df), the output is split into individual characters:
0 [ { ' J o u r n a l ... s ' : [ ] } ] } ]
When I try pd.DataFrame(df), this is the output:
0 [{'JournalLineID': 'XXX', 'AccountID': 'XXX', 'AccountCode': '200', 'AccountType': 'XXX', 'AccountName': 'XXX', 'Description': '', 'NetAmount': -428.0, 'GrossAmount': -428.0, 'TaxAmount': 0.0, 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]}, {'JournalLineID': 'XXX2', 'AccountID': 'XXX', 'AccountCode': 'XXX', 'AccountType': 'EXPENSE', 'AccountName': 'Subscriptions - Software', 'Description': 'XXXX', 'NetAmount': 400.0, 'GrossAmount': 428.0, 'TaxAmount': 28.0, 'TaxType': 'INPUT', 'TaxName': 'Purchases 7%', 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]}]
Answer 0 (score: 1)
Remove the leading 0, then call pd.DataFrame() on the rest.
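The character-by-character output from from_records suggests the cell may actually hold a string rather than a list. A minimal sketch under that assumption: parse the text with ast.literal_eval from the standard library, then pass the resulting list to pd.DataFrame(). The names raw and cell_to_frame are hypothetical, and the sample is a shortened stand-in for the data in the question.

```python
import ast

import pandas as pd

# Hypothetical stand-in for the single cell value, assumed here to be a string
raw = (
    "[{'JournalLineID': 'XXX', 'AccountCode': '200', 'NetAmount': -428.0}, "
    "{'JournalLineID': 'XXX2', 'AccountCode': 'XXX', 'NetAmount': 400.0, "
    "'TaxName': 'Purchases 7%'}]"
)


def cell_to_frame(value):
    """Parse a Python-literal string if needed, then build one row per dict."""
    records = ast.literal_eval(value) if isinstance(value, str) else value
    return pd.DataFrame(records)


lines = cell_to_frame(raw)
print(lines)
```

If the value is already a list, the helper passes it through unchanged, so it should be safe to call in either case. Keys missing from a record (TaxName in the first one) come out as NaN.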
Answer 1 (score: 1)
Just call pd.DataFrame() with that list as input:
import pandas as pd
d = [{'JournalLineID': 'e08fdfe0-560f-40f5-8e99-f239e187808b', 'AccountID': '56278544-5930-4396-b2ef-0453731c7f51', 'AccountCode': '200', 'AccountType': 'CURRLIAB', 'AccountName': 'Accounts Payable', 'Description': '', 'NetAmount': -428.0, 'GrossAmount': -428.0, 'TaxAmount': 0.0, 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'ea68de6e-32b6-4e02-a748-f916315804b0', 'TrackingOptionID': '0b785474-a48f-4413-aba5-819db2852f10', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': '94006aa4-a890-424e-be13-9786aa58732a', 'TrackingOptionID': '64b0f33a-d541-4316-bb41-6a0c3326d7a2', 'Options': []}]}, {'JournalLineID': 'cb2e42c7-b4e5-4ebb-875d-4ece7336efe4', 'AccountID': '64754738-d650-418e-8233-f578c9d65850', 'AccountCode': '652', 'AccountType': 'EXPENSE', 'AccountName': 'Subscriptions - Software', 'Description': 'Talenox Suite Plan - 48 pax - 11 June to 11 July 2020', 'NetAmount': 400.0, 'GrossAmount': 428.0, 'TaxAmount': 28.0, 'TaxType': 'INPUT', 'TaxName': 'Purchases 7%', 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'ea68de6e-32b6-4e02-a748-f916315804b0', 'TrackingOptionID': '0b785474-a48f-4413-aba5-819db2852f10', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': '94006aa4-a890-424e-be13-9786aa58732a', 'TrackingOptionID': '64b0f33a-d541-4316-bb41-6a0c3326d7a2', 'Options': []}]}, {'JournalLineID': '873e394f-10c1-4366-bad3-7521d1ff5957', 'AccountID': '50647912-37a6-4fd0-8717-7373f9ca32e0', 'AccountCode': '205', 'AccountType': 'CURRLIAB', 'AccountName': 'GST/VAT Control A/c', 'Description': '', 'NetAmount': 28.0, 'GrossAmount': 28.0, 'TaxAmount': 0.0, 'TrackingCategories': [{'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'ea68de6e-32b6-4e02-a748-f916315804b0', 'TrackingOptionID': '0b785474-a48f-4413-aba5-819db2852f10', 'Options': []}, {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 
'94006aa4-a890-424e-be13-9786aa58732a', 'TrackingOptionID': '64b0f33a-d541-4316-bb41-6a0c3326d7a2', 'Options': []}]}]
df = pd.DataFrame(d)
df
Out[11]:
JournalLineID ... TaxName
0 e08fdfe0-560f-40f5-8e99-f239e187808b ... NaN
1 cb2e42c7-b4e5-4ebb-875d-4ece7336efe4 ... Purchases 7%
2 873e394f-10c1-4366-bad3-7521d1ff5957 ... NaN
[3 rows x 12 columns]
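The TrackingCategories column in that result still holds a list of dicts per row. If those should become rows or columns too, one possible approach (not part of the original answer) is pd.json_normalize, available as a top-level function from pandas 1.0. The sketch below uses a shortened, hypothetical sample with the same shape as the data above; the choice of meta fields is an assumption.

```python
import pandas as pd

# Shortened, hypothetical sample with the same shape as the data in the question
d = [
    {"JournalLineID": "XXX", "AccountCode": "200", "NetAmount": -428.0,
     "TrackingCategories": [
         {"Name": "Location", "Option": "SG", "Options": []},
         {"Name": "Sales Rep/Dept", "Option": "HQ", "Options": []},
     ]},
    {"JournalLineID": "XXX2", "AccountCode": "XXX", "NetAmount": 400.0,
     "TrackingCategories": [
         {"Name": "Location", "Option": "SG", "Options": []},
         {"Name": "Sales Rep/Dept", "Option": "HQ", "Options": []},
     ]},
]

# One row per tracking category, with selected parent fields repeated alongside
tracking = pd.json_normalize(
    d,
    record_path="TrackingCategories",
    meta=["JournalLineID", "AccountCode", "NetAmount"],
)

# Optional: one column per category name, one row per journal line
wide = tracking.pivot(index="JournalLineID", columns="Name", values="Option")
print(wide)
```

The long form (tracking) is convenient for filtering by category, while the wide form can be joined back onto the main frame by JournalLineID.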
Answer 2 (score: 0)
I think your second approach was almost there.
import pandas as pd
# The value is a list of dicts; line continuations are not needed inside brackets
records = [
    {'JournalLineID': 'XXX', 'AccountID': 'XXX', 'AccountCode': '200', 'AccountType': 'XXX', 'AccountName': 'XXX',
     'Description': '', 'NetAmount': -428.0, 'GrossAmount': -428.0, 'TaxAmount': 0.0,
     'TrackingCategories': [
         {'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []},
         {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]},
    {'JournalLineID': 'XXX2', 'AccountID': 'XXX', 'AccountCode': 'XXX', 'AccountType': 'EXPENSE',
     'AccountName': 'Subscriptions - Software', 'Description': 'XXXX', 'NetAmount': 400.0, 'GrossAmount': 428.0,
     'TaxAmount': 28.0, 'TaxType': 'INPUT', 'TaxName': 'Purchases 7%',
     'TrackingCategories': [
         {'Name': 'Location', 'Option': 'SG', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []},
         {'Name': 'Sales Rep/Dept', 'Option': 'HQ', 'TrackingCategoryID': 'XXX', 'TrackingOptionID': 'XXX', 'Options': []}]}]
df = pd.DataFrame(records)
df
The output is as follows:
AccountCode AccountID AccountName AccountType Description GrossAmount JournalLineID NetAmount TaxAmount TaxName TaxType TrackingCategories
0 200 XXX XXX XXX -428.0 XXX -428.0 0.0 NaN NaN [{'Name': 'Location', 'Option': 'SG', 'Trackin...
1 XXX XXX Subscriptions - Software EXPENSE XXXX 428.0 XXX2 400.0 28.0 Purchases 7% INPUT [{'Name': 'Location', 'Option': 'SG', 'Trackin...
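If the starting point is already a DataFrame whose single column (labelled 0, as in the question's output) holds a whole list of dicts in each cell, the list can be spread over rows first. This is a sketch under that assumption; the names source, exploded and source_row are hypothetical, and the sample data is shortened.

```python
import pandas as pd

# Hypothetical one-column frame: each cell holds a list of journal-line dicts
records = [
    {"JournalLineID": "XXX", "NetAmount": -428.0},
    {"JournalLineID": "XXX2", "NetAmount": 400.0},
]
source = pd.Series([records]).to_frame(name=0)

# explode() yields one dict per row; the index keeps the originating row number
exploded = source[0].explode()

# Build columns from the dicts and keep the original row number alongside
lines = pd.DataFrame(exploded.tolist())
lines["source_row"] = exploded.index
print(lines)
```

Keeping source_row makes it possible to trace each journal line back to the cell it came from when the source frame has more than one row.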