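import numpy as np
import pandas as pd

# 2000 rows of random values plus one consecutive daily date per row
df = pd.DataFrame(np.random.random((2000, 3)))
df['order_date'] = pd.date_range(start='1/1/2010', periods=len(df), freq='D')
df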
Output (df):
             0         1         2 order_date
0     0.365432  0.305522  0.669302 2010-01-01
1     0.765919  0.093161  0.193244 2010-01-02
2     0.077184  0.039374  0.403210 2010-01-03
3     0.457787  0.188893  0.510776 2010-01-04
4     0.662214  0.003371  0.703892 2010-01-05
...        ...       ...       ...        ...
1995  0.709885  0.390519  0.888361 2015-06-19
1996  0.498479  0.719614  0.836749 2015-06-20
1997  0.808569  0.123956  0.050519 2015-06-21
1998  0.258573  0.663157  0.471312 2015-06-22
1999  0.018572  0.708157  0.931464 2015-06-23
Code to extract the data for every month from 2010 to 2015:
for y in range(2010,2015):
    for x in range(1,13):
        df2 = df[((df['order_date']).dt.strftime('%m') == x)&
                 ((df['order_date']).dt.strftime('%Y')== y)]
        print('Data for period',x, y,'is\n',df2)
    year=year+1
Output obtained:
Data for period 1 2010 is
Empty DataFrame
Columns: [0, 1, 2, order_date]
Index: []
Data for period 2 2010 is
Empty DataFrame
Columns: [0, 1, 2, order_date]
Index: []
...............
Data for period 12 2010 is
Empty DataFrame
Columns: [0, 1, 2, order_date]
Index: []
Data for period 1 2011 is
Empty DataFrame
Columns: [0, 1, 2, order_date]
Index: []
..............
Data for period 12 2011 is
Empty DataFrame
Columns: [0, 1, 2, order_date]
Index: []
and so on
Expected output:
I want to extract a DataFrame for each month and year, but I get empty DataFrames instead. Please help.
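Answer 0 (score: 1)
Use Series.dt.month and Series.dt.year to compare the month and year as integers. In the question, dt.strftime returns strings, which never equal the integers x and y, so every filter comes back empty: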
for y in range(2010,2015):
    for x in range(1,13):
        df2 = df[(df['order_date'].dt.month == x)&(df['order_date'].dt.year== y)]
        print('Data for period',x, y,'is\n',df2)
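Or keep strftime and convert the scalars to strings. Because '%m' is zero-padded ('01' to '12'), pad the month to two digits instead of using a plain str(x), which would miss January to September:
for y in range(2010,2015):
    for x in range(1,13):
        df2 = df[((df['order_date']).dt.strftime('%m') == f'{x:02d}')&
                 ((df['order_date']).dt.strftime('%Y')== str(y))]
        print('Data for period',x, y,'is\n',df2)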
Answer 1 (score: 0)
Change the loop to:
for y in range(2010, 2015):
    for x in range(1, 13):
        df2 = df[df["order_date"].dt.month.eq(x) & df["order_date"].dt.year.eq(y)]
        print("Data for period", x, y, "is\n", df2)
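As a possible alternative to the nested loops (not something either answer proposes), the rows could be grouped by calendar month in one pass with groupby and Series.dt.to_period. This is only a sketch: the `monthly` dictionary is a name introduced here for illustration, and the frame is rebuilt the same way as in the question.

```python
import numpy as np
import pandas as pd

# Same construction as in the question
df = pd.DataFrame(np.random.random((2000, 3)))
df['order_date'] = pd.date_range(start='1/1/2010', periods=len(df), freq='D')

# Group rows by calendar month; each key is a pandas Period such as 2010-01
# ("monthly" is a hypothetical name used only in this sketch)
monthly = {
    period: group
    for period, group in df.groupby(df['order_date'].dt.to_period('M'))
}

for period, group in monthly.items():
    print('Data for period', period.month, period.year, 'is\n', group)
```

Unlike range(2010, 2015), which stops at 2014, this covers every month that actually occurs in the data, including the 2015 rows.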