Creating a summary DataFrame? ("Collapsing" data into like months)

Time: 2018-04-10 15:10:22

Tags: python-3.x pandas

I have a .csv with a large amount of data that generally looks like this:

Customer    City    Month      Amount
Wayne E    Gotham   January    111
Wayne E    Gotham   January    222
Wayne E    Chicago  March      392
Wayne E    Buffalo  June       2928
Clark K    Krypton  January    100
Clark K    Amman    February   200
Clark K    Detroit  February   300

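I'm trying to create a summary DataFrame that lists each customer, then the unique cities they are in, and then the sum of Amount for each month.

So, for the data above, I'd like my output to look like: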

Customer    City    January February    March   April   May    June    ...    December
Wayne E    Gotham   333                 
Wayne E    Chicago                      392         
Wayne E    Buffalo                                             2928
Clark K    Krypton  100                 
Clark K    Amman            200             
Clark K    Detroit          300             

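So far I've been able to get the unique customers and cities, but I'm struggling with how to fill in the month columns. I'm not even sure I've set up my summary DataFrame in the best way, so I'm open to suggestions.

Here is what I have so far: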

import pandas as pd

df = pd.read_csv("mycsv.csv", encoding='cp1252')
customers = df["Customer"].unique()
cities = df["City"].unique()

summary_df = pd.DataFrame(columns=["Customer","City", "January","February","March","April","May","June","July","August","September", "October", "November","December"])

1 Answer:

Answer 0: (score: 1)

Are you looking for a pivot?

# pivot_table averages by default; aggfunc='sum' gives the monthly totals asked for
(df.pivot_table(index=['Customer','City'], columns='Month', values='Amount', aggfunc='sum')
   .reindex(columns=['January','February','March','April','May','June'])
   .fillna('')
   .reset_index())
Out[83]: 
Month Customer     City January February March April May  June
0       ClarkK    Amman              200                      
1       ClarkK  Detroit              300                      
2       ClarkK  Krypton     100                               
3       WayneE  Buffalo                                   2928
4       WayneE  Chicago                    392                
5       WayneE   Gotham     333
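
For reference, here is a minimal self-contained sketch of the same pivot approach extended to all twelve month columns from the question. A few things in it are assumptions rather than part of the original post: the sample rows are inlined through `io.StringIO` in place of `mycsv.csv`, the names `CSV_TEXT` and `MONTHS` are made up for illustration, and empty months are filled with 0 instead of blanks so the columns stay numeric.

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the sketch runs without mycsv.csv
# (assumption: the real file has the same four columns).
CSV_TEXT = """Customer,City,Month,Amount
Wayne E,Gotham,January,111
Wayne E,Gotham,January,222
Wayne E,Chicago,March,392
Wayne E,Buffalo,June,2928
Clark K,Krypton,January,100
Clark K,Amman,February,200
Clark K,Detroit,February,300
"""

# Calendar order for the output columns (a plain pivot sorts them alphabetically).
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

df = pd.read_csv(io.StringIO(CSV_TEXT))

summary = (
    # One row per (Customer, City), one column per month, amounts summed.
    df.pivot_table(index=["Customer", "City"], columns="Month",
                   values="Amount", aggfunc="sum", fill_value=0)
      # Put months in calendar order and add months absent from the data as 0.
      .reindex(columns=MONTHS, fill_value=0)
      .reset_index()
)
summary.columns.name = None  # drop the leftover "Month" label on the column axis

print(summary)
```

If blank cells are preferred over zeros, leaving out both `fill_value` arguments and calling `.fillna('')` before `.reset_index()` should give the same layout as the output above, at the cost of turning the month columns into object dtype.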