Using pandas

Time: 2016-05-27 09:24:54

Tags: python python-2.7 dictionary pandas

I have two lists of dicts: one contains monthly data and the other contains quarterly data, as shown below:

monthly = [
{
    "name": "Boston",
    "month": "2015-May",  
    "total_monthly": 2
}, 
{
    "name": "Boston",
    "month": "2015-June",  
    "total_monthly": 8
}, 
{
    "name": "Chicago",
    "month": "2015-May",  
    "total_monthly": 10
},
{
    "name": "Chicago",
    "month": "2015-June",  
    "total_monthly": 13
}
]

quarterly = [
{
    "name": "Boston",
    "quarter": "2015-Q1",  
    "total_quarterly": 23
}, 
{
    "name": "Boston",
    "quarter": "2015-Q2",  
    "total_quarterly": 24
}, 
{
    "name": "Chicago",
    "quarter": "2015-Q1",  
    "total_quarterly": 40
},
{
    "name": "Chicago",
    "quarter": "2015-Q2",  
    "total_quarterly": 32
}
]

Traditionally, I could loop over the lists and merge them on the common name. But how can I produce the merged data below using pandas?

merged = [
{
  "name": "Boston",
  "trend_monthly" : [
    {
      "month": "2015-May",  
      "total_monthly": 2
    }, 
    {
      "month": "2015-June",  
      "total_monthly": 8
    },
  ],  
  "trend_quarterly" : [
    {
      "quarter": "2015-Q1",  
      "total_quarterly": 23
    }, 
    {
      "quarter": "2015-Q2",  
      "total_quarterly": 24
    },
  ]
},
{
  "name": "Chicago",
  "trend_monthly" : [
    {
      "month": "2015-May",  
      "total_monthly": 10
    }, 
    {
      "month": "2015-June",  
      "total_monthly": 13
    },
  ],  
  "trend_quarterly" : [
    {
      "quarter": "2015-Q1",  
      "total_quarterly": 40
    }, 
    {
      "quarter": "2015-Q2",  
      "total_quarterly": 32
    },
  ]
}]          
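
For comparison, the loop-based merge mentioned above could look roughly like the sketch below. It is only an illustration, written for Python 3 even though the question is tagged python-2.7. The function name `merge_by_name` and the shortened sample lists are hypothetical, and it assumes every record has a `name` key.

```python
def merge_by_name(monthly, quarterly):
    """Group two lists of records by their shared "name" key (plain Python)."""
    merged = {}
    sources = (("trend_monthly", monthly), ("trend_quarterly", quarterly))
    for label, records in sources:
        for record in records:
            name = record["name"]
            # Create the per-name entry the first time a name is seen
            entry = merged.setdefault(
                name, {"name": name, "trend_monthly": [], "trend_quarterly": []}
            )
            # Store every field except the name, which lives on the parent entry
            entry[label].append({k: v for k, v in record.items() if k != "name"})
    # Dicts keep insertion order on Python 3.7+, so names stay in first-seen order
    return list(merged.values())


# Shortened sample in the same shape as the lists above (illustrative only)
sample_monthly = [
    {"name": "Boston", "month": "2015-May", "total_monthly": 2},
    {"name": "Chicago", "month": "2015-May", "total_monthly": 10},
]
sample_quarterly = [
    {"name": "Boston", "quarter": "2015-Q1", "total_quarterly": 23},
    {"name": "Chicago", "quarter": "2015-Q1", "total_quarterly": 40},
]

loop_merged = merge_by_name(sample_monthly, sample_quarterly)
```

This is the baseline any pandas version would need to reproduce: one dict per name, with the remaining fields nested under `trend_monthly` and `trend_quarterly`.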

1 answer:

Answer 0: (score: 0)

You need to do something like this:

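import pandas as pd

# `monthly` and `quarterly` are the lists of dicts from the question
df_monthly = pd.DataFrame(monthly)
df_quarterly = pd.DataFrame(quarterly)

# Group each frame by name separately. Concatenating the two frames first
# would fill the missing columns with NaN and turn the integer totals into floats.
dict_monthly = {
    name: group[['month', 'total_monthly']].to_dict('records')
    for name, group in df_monthly.groupby('name', sort=False)
}
dict_quarterly = {
    name: group[['quarter', 'total_quarterly']].to_dict('records')
    for name, group in df_quarterly.groupby('name', sort=False)
}

# Build one entry per name that holds both nested lists
result = [
    {
        'name': name,
        'trend_monthly': dict_monthly[name],
        'trend_quarterly': dict_quarterly.get(name, []),
    }
    for name in df_monthly['name'].unique()
]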
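
The grouping step can also be wrapped in a small reusable function that covers names appearing in only one of the two lists. The sketch below is an illustration, not part of the original answer: `nest_by_name` and `merge_trends` are hypothetical helper names, the sample data is shortened, and it assumes Python 3.7+ and that every record has a `name` key.

```python
import pandas as pd


def nest_by_name(records, key="name"):
    """Map each name to its rows as a list of dicts, without the key column."""
    if not records:
        # An empty DataFrame has no `name` column to group on
        return {}
    df = pd.DataFrame(records)
    return {
        name: group.drop(columns=key).to_dict("records")
        for name, group in df.groupby(key, sort=False)
    }


def merge_trends(monthly, quarterly):
    """Build one dict per name holding both nested trend lists."""
    by_month = nest_by_name(monthly)
    by_quarter = nest_by_name(quarterly)
    # Union of names, kept in first-seen order across both inputs
    names = list(dict.fromkeys(list(by_month) + list(by_quarter)))
    return [
        {
            "name": name,
            "trend_monthly": by_month.get(name, []),
            "trend_quarterly": by_quarter.get(name, []),
        }
        for name in names
    ]


# Shortened sample data in the same shape as the question's lists
monthly = [
    {"name": "Boston", "month": "2015-May", "total_monthly": 2},
    {"name": "Boston", "month": "2015-June", "total_monthly": 8},
    {"name": "Chicago", "month": "2015-May", "total_monthly": 10},
]
quarterly = [
    {"name": "Boston", "quarter": "2015-Q1", "total_quarterly": 23},
    {"name": "Chicago", "quarter": "2015-Q1", "total_quarterly": 40},
]

merged = merge_trends(monthly, quarterly)
```

A name missing from one list gets an empty list for that trend instead of raising a `KeyError`. Whether that is the right behavior depends on the data, so treat it as one possible design choice.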