I want to slice the DataFrame df month by month. How can I select all rows for a given month?
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((200,3)))
df['date'] = pd.date_range('2017-1-1', periods=200, freq='D')
mask = (df['date'] == pd.to_datetime('2017-06')) # ??? all rows for JUNE ???
print(df.loc[mask])
Answer 0 (score: 3)
If you only need to compare by month and the year does not matter, use dt.month:
mask = (df['date'].dt.month == pd.to_datetime('2017-06').month)
#same as
#mask = (df['date'].dt.month == 6)
print(df.loc[mask])
0 1 2 date
151 0.667722 0.421487 0.338626 2017-06-01
152 0.712709 0.984242 0.419231 2017-06-02
153 0.509679 0.319629 0.651422 2017-06-03
154 0.987976 0.937703 0.278857 2017-06-04
...
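To make the "year does not matter" point concrete, here is a small sketch. It assumes a longer, hypothetical date range of 600 days (not the 200 days from the question) so that the data contains June of two different years; the names june_any_year and years_found are only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 600 daily rows starting 2017-01-01, so it covers
# June 2017 and June 2018 (the question itself only uses 200 rows).
df = pd.DataFrame(np.random.random((600, 3)))
df['date'] = pd.date_range('2017-01-01', periods=600, freq='D')

# dt.month ignores the year, so this keeps June rows from every year.
june_any_year = df[df['date'].dt.month == 6]

# Collect the distinct years that ended up in the selection.
years_found = sorted(june_any_year['date'].dt.year.unique().tolist())
print(years_found)
print(len(june_any_year))
```

With data that spans only a single year, as in the question, this distinction does not show up, which is why dt.month is enough there.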
But if you need to compare by month and both the year and the month matter, use to_period:
mask = (df['date'].dt.to_period('M') == pd.to_datetime('2017-06').to_period('M'))
print(df.loc[mask])
0 1 2 date
151 0.702137 0.873511 0.458284 2017-06-01
152 0.809441 0.888400 0.350705 2017-06-02
153 0.425821 0.712912 0.339203 2017-06-03
154 0.151374 0.154301 0.923882 2017-06-04
...
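As a rough check of the to_period approach, the sketch below again assumes a hypothetical 600-day range covering two Junes, and compares the result with a mask built from dt.year and dt.month separately. The variable names target, by_period and by_parts are assumptions for this example only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 600 daily rows, covering June 2017 and June 2018.
df = pd.DataFrame(np.random.random((600, 3)))
df['date'] = pd.date_range('2017-01-01', periods=600, freq='D')

# A monthly Period identifies one specific month of one specific year.
target = pd.Period('2017-06', freq='M')
by_period = df[df['date'].dt.to_period('M') == target]

# The same selection, written as two separate comparisons.
by_parts = df[(df['date'].dt.year == 2017) & (df['date'].dt.month == 6)]

print(len(by_period))
print(by_period.equals(by_parts))
```

Both masks should pick out only June 2017; which one reads better is mostly a matter of taste.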
A solution using DatetimeIndex partial string indexing:
df = df.set_index('date')
print(df.loc['2017-06'])
0 1 2
date
2017-06-01 0.785634 0.496983 0.786512
2017-06-02 0.280444 0.091523 0.468411
2017-06-03 0.429112 0.510265 0.885642
2017-06-04 0.037233 0.034625 0.515339
2017-06-05 0.863211 0.632449 0.396963
2017-06-06 0.550682 0.975060 0.182594
...
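Partial string indexing also accepts slices, which may be handy when more than one month is needed. The following is a minimal sketch using the same 200-day data as the question; the names june and june_july are illustrative, and the slice assumes the index is sorted, as it is here.

```python
import numpy as np
import pandas as pd

# Same setup as in the question: 200 daily rows starting 2017-01-01.
df = pd.DataFrame(np.random.random((200, 3)))
df['date'] = pd.date_range('2017-01-01', periods=200, freq='D')
df = df.set_index('date')

# A single month: every row whose timestamp falls in June 2017.
june = df.loc['2017-06']

# A slice of months: both endpoints are inclusive, so this covers all of
# June plus whatever part of July exists in the data.
june_july = df.loc['2017-06':'2017-07']

print(len(june))
print(len(june_july))
```

Since the data ends on 2017-07-19, the slice simply stops at the last available row rather than raising an error.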