How do I read this JSON and convert it to a DataFrame?

Time: 2019-07-31 12:11:26

Tags: python json parsing nested

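I want to convert this nested JSON to a DataFrame. I have tried several functions, but none of them worked properly.

The encoding that works for me is encoding="utf-8-sig".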

[{'replayableActionOperationState': 'SKIPPED',
  'replayableActionOperationGuid': 'RAO_1037351',
  'failedMessage': 'Cannot replay action: RAO_1037351: com.ebay.sd.catedor.core.model.DTOEntityPropertyChange; local class incompatible: stream classdesc serialVersionUID = 7777212484705611612, local class serialVersionUID = -1785129380151507142',
  'userMessage': 'Skip all mode',
  'username': 'gfannon',
  'sourceAuditData': [{'guid': '24696601-b73e-43e4-bce9-28bc741ac117',
    'operationName': 'UPDATE_CATEGORY_ATTRIBUTE_PROPERTY',
    'creationTimestamp': 1563439725240,
    'auditCanvasInfo': {'id': '165059', 'name': '165059'},
    'auditUserInfo': {'id': 1, 'name': 'gfannon'},
    'externalId': None,
    'comment': None,
    'transactionId': '0f135909-66a7-46b1-98f6-baf1608ffd6a',
    'data': {'entity': {'guid': 'CA_2511202',
      'tagType': 'BOTH',
      'description': None,
      'name': 'Number of Shelves'},
     'propertyChanges': [{'propertyName': 'EntityProperty',
       'oldEntity': {'guid': 'CAP_35',
        'name': 'DisableAsVariant',
        'group': None,
        'action': 'SET',
        'value': 'true',
        'tagType': 'SELLER'},
       'newEntity': {'guid': 'CAP_35',
        'name': 'DisableAsVariant',
        'group': None,
        'action': 'SET',
        'value': 'false',
        'tagType': 'SELLER'}}],
     'entityChanges': None,
     'primary': True}}],
  'targetAuditData': None,
  'conflictedGuids': None,
  'fatal': False}]

This is what I have tried so far. I made other attempts as well, but this one got me the closest.

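import json
import pandas as pd

with open(r"Desktop\Ann's json parsing\report.tsv", encoding='utf-8-sig') as data_file:
    data = json.load(data_file)
    # pandas >= 1.0; on older versions use: from pandas.io.json import json_normalize
    df = pd.json_normalize(data)
    print(df)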

pd.DataFrame(df)  # The nested lists are each shown as a single column; I'm trying to parse those columns ('failedMessage' and 'sourceAuditData'). I also tried json.loads/json(df), but the output isn't correct.

pd.DataFrame.from_dict(a['sourceAuditData'][0]['data']['propertyChanges'][0])  # This line retrieves one of the outputs I need, but I don't know how to apply it to the whole file.

The expected result is a csv/xlsx file with a column for each field and a value in each row.

1 answer:

Answer 0 (score: 0)

For your specific example:

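def unroll_dict(d):
    """Walk a nested dict depth-first and return a flat list of (key, value) pairs."""
    data = []
    for k, v in d.items():
        if isinstance(v, list):
            # Emit the key as an empty header cell, then descend into the
            # first element only (enough for the sample, where each list
            # holds exactly one dict).
            data.append((k, ''))
            data.extend(unroll_dict(v[0]))
        elif isinstance(v, dict):
            data.append((k, ''))
            data.extend(unroll_dict(v))
        else:
            # Scalar value (str, int, bool, None): keep it as is.
            data.append((k, v))
    return data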

Given that the data from the question is stored in the variable example:

import pandas as pd
df = pd.DataFrame(unroll_dict(example[0])).set_index(0).transpose()  # one row; the keys become columns
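The function above reads only the first element of each list, and repeated keys such as 'guid' end up as duplicate column names. If the real file has several records, or lists with more than one entry, a variant that builds dotted, indexed column names may be easier to work with. The following is only a sketch under assumptions: `flatten`, the `sample` data (a trimmed stand-in for the record in the question), and the dotted naming scheme are all illustrative and not part of the original answer. It uses only the standard library and writes the CSV to an in-memory buffer.

```python
import csv
import io


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into one dict keyed by dotted paths.

    List elements get their index as a path component, so every element
    is kept (e.g. 'sourceAuditData.0.data.primary').
    Empty dicts and empty lists produce no columns.
    """
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Scalar (str, int, bool, None): store it under the accumulated key.
        return {prefix: value}
    flat = {}
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat


# Hypothetical, trimmed-down stand-in for the data in the question.
sample = [{
    "replayableActionOperationGuid": "RAO_1037351",
    "username": "gfannon",
    "sourceAuditData": [{
        "operationName": "UPDATE_CATEGORY_ATTRIBUTE_PROPERTY",
        "data": {
            "propertyChanges": [{
                "oldEntity": {"guid": "CAP_35", "value": "true"},
                "newEntity": {"guid": "CAP_35", "value": "false"},
            }],
            "primary": True,
        },
    }],
    "targetAuditData": None,
}]

# One flat dict per top-level record.
rows = [flatten(record) for record in sample]

# Collect every column name in first-seen order, so records with
# different shapes still share a single header.
columns = list(dict.fromkeys(key for row in rows for key in row))

# In-memory target; swap for open("out.csv", "w", newline="") to write a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

If a DataFrame is preferred over a raw CSV, the same list of flat dicts can be passed straight to `pd.DataFrame(rows)`, and the result written out with `df.to_csv(...)` or `df.to_excel(...)`.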