pandas: counting the occurrences of each month

Time: 2017-05-18 09:08:07

Tags: python-2.7 pandas matplotlib dataframe

I have a DataFrame (df_m) with a large number of rows, as shown below. I want to plot the number of occurrences per month of the date_m column, whose years range from 2010 to 2017.

   db   num       date_a      date_m      date_c      zip_b    zip_a
0  old  HKK10032  2010-07-14  2010-07-26  NaT         NaN      NaN
1  old  HKK10109  2011-07-14  2011-09-15  NaT         NaN      NaN
2  old  HNN10167  2012-07-15  2012-08-09  NaT         177-003  NaN
3  old  HKK10190  2013-07-15  2013-09-02  NaT         NaN      NaN
4  old  HKK10251  2014-07-16  2014-05-02  NaT         NaN      NaN
5  old  HKK10253  2015-07-16  2015-05-01  NaT         NaN      NaN
6  old  HNN10275  2017-07-16  2017-07-18  2010-07-18  1070062  NaN
7  old  HKK10282  2017-07-16  2017-08-16  NaT         NaN      NaN
............................................................ 

First, I want to extract the number of occurrences of each month (1-12) for each year (2010-2017). But there is an error in my code:

lst_all = []
for i in range(2010, 2018):
    lst_num = [sum(df_m.date_move.dt.month == j & df_m.date_move.dt.year == i) for j in range(1, 13)]
    lst_all.append(lst_num)
print lst_all

1 answer:

Answer 0 (score: 1)

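You need to wrap each condition in (), because & binds more tightly than ==. It also helps to call .sum() on the resulting boolean mask, and the column in the sample data is date_m: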

lst_all = []
for i in range(2010, 2018):
    lst_num = [((df_m.date_m.dt.month == j) & (df_m.date_m.dt.year == i)).sum() for j in range(1, 13)]
    lst_all.append(lst_num)
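To see why the parentheses matter, here is a minimal sketch on a hypothetical three-row Series standing in for df_m.date_m (the names dates, month, year, and unparenthesised_failed are made up for this illustration). Without parentheses, Python reads the expression as a chained comparison around `month & dates.dt.year`, which ends up asking for the truth value of a whole Series.

```python
import pandas as pd

# Hypothetical sample standing in for df_m.date_m (not the real data)
dates = pd.Series(pd.to_datetime(["2010-07-26", "2011-09-15", "2017-07-18"]))

month, year = 7, 2010

# Parenthesised: both comparisons run first, then & combines them element-wise
mask = (dates.dt.month == month) & (dates.dt.year == year)
count = int(mask.sum())
print(count)

# Unparenthesised: parsed as
#   dates.dt.month == (month & dates.dt.year) == year
# a chained comparison that needs bool(Series), which pandas refuses
try:
    bad = dates.dt.month == month & dates.dt.year == year
    unparenthesised_failed = False
except ValueError as exc:
    unparenthesised_failed = True
    print("ValueError:", exc)
```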

With lst_all filled in, you then get:

df1 = pd.DataFrame(lst_all, index=range(2010, 2018), columns=range(1, 13))
print (df1)
      1   2   3   4   5   6   7   8   9   10  11  12
2010   0   0   0   0   0   0   1   0   0   0   0   0
2011   0   0   0   0   0   0   0   0   1   0   0   0
2012   0   0   0   0   0   0   0   1   0   0   0   0
2013   0   0   0   0   0   0   0   0   1   0   0   0
2014   0   0   0   0   1   0   0   0   0   0   0   0
2015   0   0   0   0   1   0   0   0   0   0   0   0
2016   0   0   0   0   0   0   0   0   0   0   0   0
2017   0   0   0   0   0   0   1   1   0   0   0   0
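
As a possible alternative to the nested loop (this is not part of the answer above), the same year-by-month table can be sketched with pd.crosstab. The sample below rebuilds only the eight date_m values visible in the question; the names counts, year, and month are chosen for this illustration. The reindex step is there so that years or months with no rows, such as 2016, still appear as zeros.

```python
import pandas as pd

# date_m values copied from the eight rows shown in the question
df_m = pd.DataFrame({"date_m": pd.to_datetime([
    "2010-07-26", "2011-09-15", "2012-08-09", "2013-09-02",
    "2014-05-02", "2015-05-01", "2017-07-18", "2017-08-16",
])})

# Give the two key Series distinct names so the axes are labelled clearly
year = df_m["date_m"].dt.year.rename("year")
month = df_m["date_m"].dt.month.rename("month")

# Count rows for every (year, month) pair that occurs in the data
counts = pd.crosstab(year, month)

# Add the missing years and months as all-zero rows and columns
counts = counts.reindex(index=range(2010, 2018), columns=range(1, 13), fill_value=0)
print(counts)
```

If matplotlib is installed, something like counts.plot(kind='bar') would be one way to turn the table into the plot the question asks for, with one group of bars per year.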