Dictionaries under 'record_path' are not flattened by json_normalize

Time: 2018-08-01 22:10:49

Tags: pandas

Below is the JSON data structure that I am trying to convert to CSV:

[{
            "ASIN": "B0773V2Z6",
            "Condition": "NewItem",
            "EarliestAvailability": {
                "TimepointType": "Immediately"
            },
            "FNSKU": "B0773V2Z6",
            "InStockSupplyQuantity": "18",
            "SellerSKU": "30237",
            "SupplyDetail.member": [
                    {
                        "EarliestAvailableToPick": {
                            "TimepointType": "Immediately"
                        },
                        "LatestAvailableToPick": {
                            "TimepointType": "Immediately"
                        },
                        "Quantity": "1",
                        "SupplyType": "InStock"
                    },
                    {
                        "EarliestAvailableToPick": {
                            "TimepointType": "Immediately"
                        },
                        "LatestAvailableToPick": {
                            "TimepointType": "Immediately"
                        },
                        "Quantity": "1",
                        "SupplyType": "InStock"
                    }
           ],
           "TotalSupplyQuantity": "18"
}]

I tried json_normalize from the pandas library, as follows:

df = json_normalize(json_data, record_path="SupplyDetail.member", meta=["ASIN"], errors='ignore')

It gives the following result:

EarliestAvailableToPick              LatestAvailableToPick              ASIN
{'TimepointType': 'Immediately'}    {'TimepointType': 'Immediately'}    B0773V2Z6T
{'TimepointType': 'Immediately'}    {'TimepointType': 'Immediately'}    B0773V2Z6T

The result I need is:

EarliestAvailableToPick.TimepointType   LatestAvailableToPick.TimepointType   ASIN
'Immediately'                           'Immediately'                         B0773V2Z6T
'Immediately'                           'Immediately'                         B0773V2Z6T

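I know that json_normalize flattens dictionaries that sit at the first level of the JSON. However, when 'record_path' is used, it does not flatten the dictionaries under that path. Please help.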

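2 Answers:

Answer 0: (score: 1)

Here is what I did to get the desired result: normalize once with record_path, serialize the resulting DataFrame back to JSON records, and normalize those records a second time so the nested dictionaries become dotted columns.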


df = json_normalize(json_data, record_path="SupplyDetail.member", meta=["ASIN"], errors='ignore')
re_data = df.to_json(orient='records')
df_new = json_normalize(json.loads(re_data))

The result is in df_new.
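The round-trip in this answer can be written as a self-contained script. This is only a sketch under a few assumptions: it uses the top-level pd.json_normalize (available in pandas 1.0 and later) rather than the older import the thread implies, it trims the sample record to the fields that matter here, and it takes the ASIN value from the question's input. The csv_text variable is added for illustration.

```python
import json

import pandas as pd

# Trimmed copy of the question's sample data (assumption: other fields omitted).
json_data = [{
    "ASIN": "B0773V2Z6",
    "SupplyDetail.member": [
        {
            "EarliestAvailableToPick": {"TimepointType": "Immediately"},
            "LatestAvailableToPick": {"TimepointType": "Immediately"},
            "Quantity": "1",
            "SupplyType": "InStock",
        },
        {
            "EarliestAvailableToPick": {"TimepointType": "Immediately"},
            "LatestAvailableToPick": {"TimepointType": "Immediately"},
            "Quantity": "1",
            "SupplyType": "InStock",
        },
    ],
}]

# First pass: one row per entry under "SupplyDetail.member", with ASIN attached.
df = pd.json_normalize(
    json_data, record_path="SupplyDetail.member", meta=["ASIN"], errors="ignore"
)

# Second pass: serialize the rows to JSON records and normalize them again,
# so any remaining nested dicts become dotted columns. If the first pass
# already produced dotted columns, this step leaves them unchanged.
re_data = df.to_json(orient="records")
df_new = pd.json_normalize(json.loads(re_data))

# With no path argument, to_csv returns the CSV text as a string.
csv_text = df_new.to_csv(index=False)
print(csv_text)
```

The extra serialization costs time on large frames, so this is best suited to small payloads like the one in the question.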

Answer 1: (score: -1)

As a quick and dirty fix, you can do the following:

df['EarliestAvailableToPick.TimepointType'] = df.EarliestAvailableToPick.map(lambda d: d['TimepointType'])
df['LatestAvailableToPick.TimepointType'] = df.LatestAvailableToPick.map(lambda d: d['TimepointType'])
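The two map calls hard-code each nested column. A possible generalization, sketched here with only the standard library, is to flatten every member record before writing the CSV. The flatten helper and the rows and csv_text names are hypothetical and do not come from the thread, and the sample data is trimmed from the question.

```python
import csv
import io


def flatten(record, parent_key="", sep="."):
    """Recursively flatten nested dicts into one dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Trimmed copy of the question's sample data (assumption: other fields omitted).
json_data = [{
    "ASIN": "B0773V2Z6",
    "SupplyDetail.member": [
        {
            "EarliestAvailableToPick": {"TimepointType": "Immediately"},
            "LatestAvailableToPick": {"TimepointType": "Immediately"},
            "Quantity": "1",
            "SupplyType": "InStock",
        },
        {
            "EarliestAvailableToPick": {"TimepointType": "Immediately"},
            "LatestAvailableToPick": {"TimepointType": "Immediately"},
            "Quantity": "1",
            "SupplyType": "InStock",
        },
    ],
}]

# One flat row per member record, with the parent ASIN copied onto each row.
rows = []
for item in json_data:
    for member in item["SupplyDetail.member"]:
        row = flatten(member)
        row["ASIN"] = item["ASIN"]
        rows.append(row)

# Write the rows as CSV into an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

This sketch assumes every member record has the same keys. With ragged records, the fieldnames list would need to be the union of all row keys.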