How do I group values in a pivot table?

Time: 2018-10-05 05:00:21

Tags: python pandas

I'm new to Python and I'm struggling with a task, so I hope my question isn't too basic.

I exported a CSV file whose data is organized like this example:

Company City  Company Country  Accelerator $  Accelerator Date  Angel $  Angel Date  Seed $  Seed Date  Series A $  Series A Date
              United Kingdom   0              7/3/2017          0        1/0/1900    0.0     1/0/1900   0.0         1/0/1900
Roubaix       France           0.02           9/1/2016          0        1/0/1900    0.0     1/0/1900   2.15        11/2/2015
Montpellier   France           0              12/4/2014         0        1/0/1900    0.0     1/0/1900   0.0         1/0/1900
Beijing       China            0              1/0/1900          0        1/0/1900    0.0     1/0/1900   16.0        2/7/2018

I need to organize the data like this, with one row per funding round and one column per year:

          2014         2015          2016          2017
Angel     $4,690,000   $4,150,000    $16,683,000   $6,520,000
Seed      $17,890,000  $35,590,000   $53,860,000   $24,700,000
Series A  $49,500,000  $123,430,000  $110,810,000  $123,220,000

I would be very grateful if you could help me!

1 answer:

Answer 0 (score: 0)

You can use:

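# df is the DataFrame loaded from the exported CSV
# move the two company columns into the index so only the "<round> $" / "<round> Date" columns remain
df = df.set_index(['Company City', 'Company Country'])
# or drop them instead
# df = df.drop(['Company City', 'Company Country'], axis=1)
# build a MultiIndex in the columns by splitting each name once, from the right, on whitespace
# e.g. 'Series A Date' -> ('Series A', 'Date')
df.columns = df.columns.str.rsplit(n=1, expand=True)
# move the round name (column level 0) into the index, leaving two columns: '$' and 'Date'
df = df.stack(0)
# extract the year from the last 4 characters of each date string
df['Date'] = df['Date'].str[-4:].astype(int)
# pivot: one row per round, one column per year, summing the amounts
# ('level_2' is the default name of the unnamed index level that holds the round)
df = df.reset_index().pivot_table(index='level_2', columns='Date', values='$', aggfunc='sum')
print(df)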

Date         1900  2014  2015  2016  2017  2018
level_2                                        
Accelerator   0.0   0.0   NaN  0.02   0.0   NaN
Angel         0.0   NaN   NaN   NaN   NaN   NaN
Seed          0.0   NaN   NaN   NaN   NaN   NaN
Series A      0.0   NaN  2.15   NaN   NaN  16.0
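
Note that the 1900 column in the output above collects every row whose date is 1/0/1900. As a minimal, self-contained sketch of the same idea, the version below rebuilds the sample rows from the question and drops those dates before pivoting. It rests on two assumptions that are not confirmed in the question: that 1/0/1900 is a spreadsheet placeholder meaning "no date", and that the sums should not include those rows. It also swaps stack for an explicit loop with pd.concat, which is a different route to the same long format, not the method used above. The names rounds, long_df, Round, Amount and Year are chosen here for illustration only.

```python
import pandas as pd

# Sample rows copied from the question (the first row has no city).
df = pd.DataFrame({
    "Company City": [None, "Roubaix", "Montpellier", "Beijing"],
    "Company Country": ["United Kingdom", "France", "France", "China"],
    "Accelerator $": [0, 0.02, 0, 0],
    "Accelerator Date": ["7/3/2017", "9/1/2016", "12/4/2014", "1/0/1900"],
    "Angel $": [0, 0, 0, 0],
    "Angel Date": ["1/0/1900"] * 4,
    "Seed $": [0.0, 0.0, 0.0, 0.0],
    "Seed Date": ["1/0/1900"] * 4,
    "Series A $": [0.0, 2.15, 0.0, 16.0],
    "Series A Date": ["1/0/1900", "11/2/2015", "1/0/1900", "2/7/2018"],
})

# Hypothetical list of funding rounds, matching the column prefixes above.
rounds = ["Accelerator", "Angel", "Seed", "Series A"]

# Build a long table with one row per (company, round).
parts = []
for name in rounds:
    parts.append(pd.DataFrame({
        "Round": name,
        "Amount": df[f"{name} $"].astype(float),
        # the year is the last 4 characters of the date string
        "Year": df[f"{name} Date"].str[-4:].astype(int),
    }))
long_df = pd.concat(parts, ignore_index=True)

# Assumption: 1/0/1900 is a placeholder for "no date", so drop those rows.
long_df = long_df[long_df["Year"] != 1900]

# One row per round, one column per year, summing the amounts.
result = (
    long_df.pivot_table(index="Round", columns="Year", values="Amount", aggfunc="sum")
    .reindex(rounds)   # keep rounds that have no dated rows at all
    .fillna(0)         # show missing round/year combinations as 0
)
print(result)
```

If the placeholder rows should be kept, removing the filter on Year brings back the 1900 column, as in the output above.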