How to select seasons/months across several years from a pandas DataFrame?

Time: 2018-06-14 12:09:30

Tags: pandas date-range

I have a pandas DataFrame with three consecutive years of data... Is there a simple way to select the summer/winter seasons?

Currently, I do this:

df_summer = df[((df.index.month > 5) & (df.index.month < 9))]
df_winter = df[((df.index.month < 3) | (df.index.month == 12))]

But from xarray I am used to notation like this:

xa_summer = xa[xa.dt.season=='JJA']

For a single year I can slice by date:

df_summer = df['YYYYMMDDHH':'YYYYMMDDHH']

Is there a more intuitive way to select seasons across multiple years in pandas?

...Thanks

1 Answer:

Answer 0: (score: 1)

This is not implemented in pandas yet.

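One idea is to build a season helper from the index months: take the month modulo 12, add 3, floor-divide by 3, and finally map the result to season names. The modulo turns December into 0, so December, January and February all give 1, March to May give 2, June to August give 3, and September to November give 4:

import pandas as pd

rng = pd.date_range('2017-04-03', periods=40, freq='MS')
df = pd.DataFrame({'a': range(40)}, index=rng)
#print (df)

season = ((df.index.month % 12 + 3) // 3).map({1:'DJF', 2: 'MAM', 3:'JJA', 4:'SON'})
print (season)
Index(['MAM', 'JJA', 'JJA', 'JJA', 'SON', 'SON', 'SON', 'DJF', 'DJF', 'DJF',
       'MAM', 'MAM', 'MAM', 'JJA', 'JJA', 'JJA', 'SON', 'SON', 'SON', 'DJF',
       'DJF', 'DJF', 'MAM', 'MAM', 'MAM', 'JJA', 'JJA', 'JJA', 'SON', 'SON',
       'SON', 'DJF', 'DJF', 'DJF', 'MAM', 'MAM', 'MAM', 'JJA', 'JJA', 'JJA'],
      dtype='object')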

df_summer = df[season == 'JJA']
print (df_summer)
             a
2017-06-01   1
2017-07-01   2
2017-08-01   3
2018-06-01  13
2018-07-01  14
2018-08-01  15
2019-06-01  25
2019-07-01  26
2019-08-01  27
2020-06-01  37
2020-07-01  38
2020-08-01  39

df_winter = df[season == 'DJF']
print (df_winter)
             a
2017-12-01   7
2018-01-01   8
2018-02-01   9
2018-12-01  19
2019-01-01  20
2019-02-01  21
2019-12-01  31
2020-01-01  32
2020-02-01  33
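
If the season labels are needed in more than one place, the arithmetic above can be wrapped in a small function. This is only a sketch: the names `season_labels` and `SEASON_NAMES` are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Illustrative constant (not a pandas API): season code -> label,
# using the same codes as the formula in the answer.
SEASON_NAMES = {1: "DJF", 2: "MAM", 3: "JJA", 4: "SON"}


def season_labels(index: pd.DatetimeIndex) -> pd.Index:
    """Return one season label per timestamp in `index`."""
    # month % 12 maps December to 0, so Dec/Jan/Feb become 0, 1, 2;
    # adding 3 and floor-dividing by 3 gives codes 1..4.
    codes = (index.month % 12 + 3) // 3
    return codes.map(SEASON_NAMES)


rng = pd.date_range("2017-04-03", periods=40, freq="MS")
df = pd.DataFrame({"a": range(40)}, index=rng)

season = season_labels(df.index)
df_summer = df[season == "JJA"]
df_winter = df[season == "DJF"]
```

The boolean masks `season == "JJA"` and `season == "DJF"` select the same rows as the outputs shown above.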
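
Or create a new column:

df['season'] = ((df.index.month % 12 + 3) // 3).map({1:'DJF', 2: 'MAM', 3:'JJA', 4:'SON'})

df_summer = df[df['season'] == 'JJA']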
print (df_summer)
             a season
2017-06-01   1    JJA
2017-07-01   2    JJA
2017-08-01   3    JJA
2018-06-01  13    JJA
2018-07-01  14    JJA
2018-08-01  15    JJA
2019-06-01  25    JJA
2019-07-01  26    JJA
2019-08-01  27    JJA
2020-06-01  37    JJA
2020-07-01  38    JJA
2020-08-01  39    JJA

df_winter = df[df['season'] == 'DJF']
print (df_winter)
             a season
2017-12-01   7    DJF
2018-01-01   8    DJF
2018-02-01   9    DJF
2018-12-01  19    DJF
2019-01-01  20    DJF
2019-02-01  21    DJF
2019-12-01  31    DJF
2020-01-01  32    DJF
2020-02-01  33    DJF
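
The same season column can be built when the timestamps sit in an ordinary column rather than in the index. The sketch below assumes a hypothetical column named `date`; `Series.dt.month` plays the role that `DatetimeIndex.month` plays above.

```python
import pandas as pd

# Hypothetical frame: timestamps are stored in a column named "date",
# not in the index. The dates match the example data used above.
df = pd.DataFrame({
    "date": pd.date_range("2017-05-01", periods=40, freq="MS"),
    "a": range(40),
})

# Series.dt.month gives the month number, so the same arithmetic applies.
season_code = (df["date"].dt.month % 12 + 3) // 3
df["season"] = season_code.map({1: "DJF", 2: "MAM", 3: "JJA", 4: "SON"})

df_summer = df[df["season"] == "JJA"]
df_winter = df[df["season"] == "DJF"]
```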

Another idea is to use Index.isin on the index months:

df_summer = df[df.index.month.isin([6,7,8])]
print (df_summer)
             a
2017-06-01   1
2017-07-01   2
2017-08-01   3
2018-06-01  13
2018-07-01  14
2018-08-01  15
2019-06-01  25
2019-07-01  26
2019-08-01  27
2020-06-01  37
2020-07-01  38
2020-08-01  39

df_winter = df[df.index.month.isin([12,1,2])]
print (df_winter)
             a
2017-12-01   7
2018-01-01   8
2018-02-01   9
2018-12-01  19
2019-01-01  20
2019-02-01  21
2019-12-01  31
2020-01-01  32
2020-02-01  33
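
To get closer to the xarray-style `season == 'JJA'` notation with the month-based filter, the month lists can be kept in a dictionary keyed by season name. This is a minimal sketch; `SEASON_MONTHS` and `select_season` are illustrative names, not pandas functions.

```python
import pandas as pd

# Illustrative constant: months that make up each meteorological season.
SEASON_MONTHS = {
    "DJF": [12, 1, 2],
    "MAM": [3, 4, 5],
    "JJA": [6, 7, 8],
    "SON": [9, 10, 11],
}


def select_season(frame: pd.DataFrame, name: str) -> pd.DataFrame:
    """Return the rows of `frame` (DatetimeIndex required) in season `name`."""
    return frame[frame.index.month.isin(SEASON_MONTHS[name])]


rng = pd.date_range("2017-04-03", periods=40, freq="MS")
df = pd.DataFrame({"a": range(40)}, index=rng)

df_summer = select_season(df, "JJA")
df_winter = select_season(df, "DJF")
```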