Count aggregation in a pandas pivot table

Time: 2016-06-30 10:18:49

Tags: python-2.7 pandas matplotlib

I am trying to count "STAGE" per project. I am using np.size as the aggfunc, but the counts it returns are doubled: where the expected count is 3, I get 6.
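One plausible cause of the doubling, which the question does not confirm: np.size applied to a whole DataFrame counts every element (rows times columns), not rows. The sketch below uses a hypothetical three-row, two-column group to show how a row count of 3 can turn into 6.

```python
import numpy as np
import pandas as pd

# Hypothetical group: the three rows of one project, with two columns.
group = pd.DataFrame({"Project": ["an", "an", "an"],
                      "Stage": ["ip", "ip", "ip"]})

rows = len(group)          # number of rows in the group
elements = np.size(group)  # number of elements: rows * columns

print(rows, elements)
```

If the real frame has a different number of columns, the factor would differ accordingly.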


I used the following code:

record_id  abbreviation  patient_id  study_id  step_count  distance  ambulation_time  velocity  cadence  normalized_velocity  step_time_differential  step_length_differential  cycle_time_differential  step_time  step_length  step_extremity  cycle_time  stride_length  hh_base_support  swing_time  stance_time  single_supp_time  double_supp_time  toe_in_out 
1                                              3           292,34    1,67             175,1     107,8                         0,004                   1,051                     0,008                    0,56       97,27                        1,11        194,64         4,65             0,47        0,65         0,47              0,18              1,45

1 answer:

Answer 0 (score: 1)

You need the aggregation function len:

print (data_frame)
  Project Stage
0      an    ip
1     cfc    pe
2      an    ip
3      ap    pe
4     cfc    pe
5      an    ip
6     cfc    ip

df = pd.pivot_table(data_frame, 
                    index='Project',
                    columns='Stage', 
                    aggfunc=len, 
                    fill_value=0)
print (df)
Stage    ip  pe
Project        
an        3   0
ap        0   1
cfc       1   2

Another solution, with size:

df = pd.pivot_table(data_frame, 
                    index='Project',
                    columns='Stage', 
                    aggfunc='size', 
                    fill_value=0)
print (df)
Stage    ip  pe
Project        
an        3   0
ap        0   1
cfc       1   2
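As a cross-check that does not go through pivot_table, the same table can be built with groupby plus size, or with pd.crosstab. This is a minimal sketch that rebuilds the sample data_frame shown above; the names counts_groupby and counts_crosstab are introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the printed data_frame above.
data_frame = pd.DataFrame({
    "Project": ["an", "cfc", "an", "ap", "cfc", "an", "cfc"],
    "Stage":   ["ip", "pe", "ip", "pe", "pe", "ip", "ip"],
})

# Count rows per (Project, Stage) pair, then spread Stage into columns.
counts_groupby = (data_frame.groupby(["Project", "Stage"])
                            .size()
                            .unstack(fill_value=0))

# crosstab counts co-occurrences directly and fills missing pairs with 0.
counts_crosstab = pd.crosstab(data_frame["Project"], data_frame["Stage"])

print(counts_groupby)
print(counts_crosstab)
```

Both variants count rows, so the result does not depend on how many other columns the frame has.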

Edit, following a comment:

import matplotlib.pyplot as plt
# ... all of the code above goes here ...

df.plot.bar()
plt.show()
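For a version that runs on its own, here is a hedged sketch that rebuilds the counts from the sample data and draws the grouped bar chart. The Agg backend and the axis label are assumptions made so that the script also runs without a display. In an interactive session, drop the matplotlib.use line and call plt.show() at the end.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the answer above.
data_frame = pd.DataFrame({
    "Project": ["an", "cfc", "an", "ap", "cfc", "an", "cfc"],
    "Stage":   ["ip", "pe", "ip", "pe", "pe", "ip", "ip"],
})

# One row per project, one column per stage, values are row counts.
counts = pd.crosstab(data_frame["Project"], data_frame["Stage"])

# One group of bars per project, one bar per stage within each group.
ax = counts.plot.bar()
ax.set_ylabel("count")  # label chosen for illustration
```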
