List of lists to a pandas DataFrame

Time: 2017-10-15 04:06:21

Tags: python pandas

I am trying to parse a list containing nested lists into a pandas DataFrame.

Here is one element of the list:

>>> result[1]
{
    "account_currency": "BRL",
    "account_id": "1600343406676896",
    "account_name": "aaa",
    "buying_type": "AUCTION",
    "campaign_id": "aaa",
    "campaign_name": "aaaL",
    "canvas_avg_view_percent": "0",
    "canvas_avg_view_time": "0",
    "clicks": "1",
    "cost_per_total_action": "8.15",
    "cpm": "60.820896",
    "cpp": "61.278195",
    "date_start": "2017-10-08",
    "date_stop": "2017-10-15",
    "device_platform": "desktop",
    "frequency": "1.007519",
    "impression_device": "desktop",
    "impressions": "134",
    "inline_link_clicks": "1",
    "inline_post_engagement": "1",
    "objective": "CONVERSIONS",
    "outbound_clicks": [
        {
            "action_type": "outbound_click",
            "value": "1"
        }
    ],
    "platform_position": "feed",
    "publisher_platform": "facebook",
    "reach": "133",
    "social_clicks": "1",
    "social_impressions": "91",
    "social_reach": "90",
    "spend": "8.15",
    "total_action_value": "0",
    "total_actions": "1",
    "total_unique_actions": "1",
    "unique_actions": [
        {
            "action_type": "landing_page_view",
            "value": "1"
        },
        {
            "action_type": "link_click",
            "value": "1"
        },
        {
            "action_type": "page_engagement",
            "value": "1"
        },
        {
            "action_type": "post_engagement",
            "value": "1"
        }
    ],
    "unique_clicks": "1",
    "unique_inline_link_clicks": "1",
    "unique_outbound_clicks": [
        {
            "action_type": "outbound_click",
            "value": "1"
        }
    ],
    "unique_social_clicks": "1"
}

When I convert it to a pandas DataFrame, I get:

>>> df = pd.DataFrame(result)
>>> df
....

 unique_actions  \
NaN   
[{u'value': u'1', u'action_type': u'landing_pa...   
NaN   
[{u'value': u'2', u'action_type': u'landing_pa...   
[{u'value': u'4', u'action_type': u'landing_pa...   
 NaN   

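The unique_actions column and the other nested fields are not normalized: each cell holds either NaN or a list of dicts.

How can I normalize them to the same granularity as the other columns?

2 Answers:

Answer 0: (score: 1)

You can use json_normalize, as follows: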

pd.io.json.json_normalize(df.unique_actions)
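
The unique_actions column in the question holds NaN for rows that have no actions, so the column may need cleaning before it can be flattened. The following is a minimal sketch of one way to do that, not the answerer's method: it assumes pandas 0.25 or later (for Series.explode), no empty lists in the column, and a hypothetical three-row frame standing in for the real df.

```python
import pandas as pd

# Hypothetical stand-in for the question's df: one row without actions.
df = pd.DataFrame({
    "account_id": ["1", "2", "3"],
    "unique_actions": [
        None,
        [{"action_type": "landing_page_view", "value": "1"},
         {"action_type": "link_click", "value": "1"}],
        [{"action_type": "landing_page_view", "value": "2"}],
    ],
})

# Drop the missing cells, then emit one row per dict in each list.
# explode() repeats the original row index for every list element.
exploded = df["unique_actions"].dropna().explode()

# Turn the dicts into columns, keeping the repeated row index.
actions = pd.DataFrame(exploded.tolist(), index=exploded.index)

# Join back on the index so every action carries its parent row's fields.
flat = df.drop(columns="unique_actions").join(actions, how="inner")
print(flat)
```

Rows without actions are dropped by the inner join; using how="left" instead would keep them with NaN in the action columns.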

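Answer 1: (score: 1)

Consider json_normalize, passing the nested list as record_path and all other metrics as meta. However, because you have several nested lists, the JSON carries the information for three separate data frames: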

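# pandas >= 1.0 exposes json_normalize at the top level;
# on older versions use: from pandas.io.json import json_normalize
from pandas import json_normalize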


merge_fields = ['account_currency', 'account_id', 'account_name', 'buying_type', 'campaign_id', 
                'campaign_name', 'canvas_avg_view_percent', 'canvas_avg_view_time', 'clicks', 
                'cost_per_total_action', 'cpm', 'cpp', 'date_start', 'date_stop', 'device_platform', 
                'frequency', 'impression_device', 'impressions', 'inline_link_clicks', 'inline_post_engagement', 
                'objective', 'platform_position', 'publisher_platform', 'reach', 'social_clicks', 'social_impressions', 
                'social_reach', 'spend', 'total_action_value', 'total_actions', 'total_unique_actions',
                'unique_clicks', 'unique_inline_link_clicks', 'unique_social_clicks']


unique_actions_df = json_normalize(result[1], record_path='unique_actions', meta=merge_fields)

outbound_clicks_df = json_normalize(result[1], record_path='outbound_clicks', meta=merge_fields)

unique_outbound_clicks_df = json_normalize(result[1], record_path='unique_outbound_clicks', meta=merge_fields)
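
The calls above flatten a single record, result[1]. To cover the whole result list, records that lack the nested key (the NaN rows in the question's output) probably need to be filtered out first, since json_normalize expects every record to contain the record_path. A minimal sketch under those assumptions: the helper name normalize_nested and the two-record sample are hypothetical, the meta list is shortened for brevity, and pd.json_normalize requires pandas 1.0 or later.

```python
import pandas as pd

# Hypothetical two-record sample standing in for `result`;
# the first record has no nested lists at all.
result = [
    {"account_id": "1", "clicks": "0"},
    {"account_id": "2", "clicks": "1",
     "unique_actions": [
         {"action_type": "landing_page_view", "value": "1"},
         {"action_type": "link_click", "value": "1"},
     ],
     "outbound_clicks": [
         {"action_type": "outbound_click", "value": "1"},
     ]},
]

def normalize_nested(records, record_path, meta):
    """Flatten one nested list field across all records that contain it."""
    # Keep only records where the nested field is actually a list.
    rows = [r for r in records if isinstance(r.get(record_path), list)]
    # errors="ignore" fills meta keys missing from a record with NaN
    # instead of raising KeyError.
    return pd.json_normalize(rows, record_path=record_path,
                             meta=meta, errors="ignore")

meta_fields = ["account_id", "clicks"]  # shortened version of merge_fields
unique_actions_df = normalize_nested(result, "unique_actions", meta_fields)
outbound_clicks_df = normalize_nested(result, "outbound_clicks", meta_fields)
print(unique_actions_df)
```

Each resulting frame has one row per action, with the meta fields repeated alongside it, so the three frames can be analysed separately or merged on the shared meta columns.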