I have the following pandas DataFrame, df:
pay pay2 date
0 209.070007 208.000000 2018-08-06
1 207.110001 209.320007 2018-08-07
2 207.250000 206.050003 2018-08-08
3 208.880005 207.279999 2018-08-09
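Given a last_date, I want to retrieve the last n rows up to last_date, including the row for last_date itself.
For example, given last_date = '2018-08-08' and n = 3, the resulting DataFrame should look like this: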
pay pay2 date
0 209.070007 208.000000 2018-08-06
1 207.110001 209.320007 2018-08-07
2 207.250000 206.050003 2018-08-08
Answer 0 (score: 2)
Use:
In [285]: s = df.date.eq(last_date)
In [286]: loc = s.index[s][-1] # or s.idxmax()
In [287]: df.loc[loc-2:loc]  # n = 3 rows: .loc slicing includes both endpoints, so start at loc-(n-1)
Out[287]:
pay pay2 date
0 209.070007 208.000000 2018-08-06
1 207.110001 209.320007 2018-08-07
2 207.250000 206.050003 2018-08-08
Details
In [288]: s
Out[288]:
0 False
1 False
2 True
3 False
Name: date, dtype: bool
In [289]: loc
Out[289]: 2
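The label arithmetic above depends on df having the default 0..N-1 integer index. A minimal sketch of the same idea using positions instead of labels follows, assuming the rows are already in chronological order. The helper name last_n_rows is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "pay": [209.070007, 207.110001, 207.250000, 208.880005],
    "pay2": [208.000000, 209.320007, 206.050003, 207.279999],
    "date": ["2018-08-06", "2018-08-07", "2018-08-08", "2018-08-09"],
})

def last_n_rows(df, last_date, n):
    """Return up to n rows ending at the last row whose date equals last_date."""
    # Positions (not index labels) of the rows that match last_date.
    hits = np.flatnonzero(df["date"].eq(last_date).to_numpy())
    if hits.size == 0:
        raise KeyError(f"{last_date!r} not found in the date column")
    end = int(hits[-1]) + 1   # iloc slices exclude the stop position
    start = max(0, end - n)   # clamp so fewer than n rows are returned near the top
    return df.iloc[start:end]

result = last_n_rows(df, "2018-08-08", 3)
print(result)
```

Because iloc works on positions, this version should behave the same after the frame has been filtered or re-indexed.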
Answer 1 (score: 2)
I would do it a bit more concisely. First convert the string to a datetime.
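last_date = '2018-08-08'
last_date = pd.to_datetime(last_date)
n = 2
df['date'] = pd.to_datetime(df['date'])  # compare datetimes, not strings
# Rows are assumed to be sorted by date; tail(n) keeps the n most recent ones.
df.loc[df['date'] <= last_date].tail(n)
# pay pay2 date
# 1 207.110001 209.320007 2018-08-07
# 2 207.250000 206.050003 2018-08-08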
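One practical difference from an equality-based lookup is that a <= filter still has something to work with when last_date itself is missing from the column. A small sketch with the sample data, assuming the rows are sorted by date; the date 2018-08-10 is chosen here purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "pay": [209.070007, 207.110001, 207.250000, 208.880005],
    "pay2": [208.000000, 209.320007, 206.050003, 207.279999],
    "date": ["2018-08-06", "2018-08-07", "2018-08-08", "2018-08-09"],
})
df["date"] = pd.to_datetime(df["date"])

last_date = pd.to_datetime("2018-08-10")  # illustrative: not present in df
n = 2

# An equality test finds no row to anchor on.
has_exact_match = bool(df["date"].eq(last_date).any())

# The <= filter keeps every row on or before last_date;
# tail(n) then takes the n most recent of those.
result = df.loc[df["date"] <= last_date].tail(n)
print(has_exact_match)
print(result)
```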
Answer 2 (score: 1)
You need:
import pandas as pd

df = pd.DataFrame({'pay': ['209.07', '207.110001', '207.250000', '208.880005'],
                   'pay2': ['208', '209.320007', '206.050003', '207.279999'],
                   'date': ['2018-08-06', '2018-08-07', '2018-08-08', '2018-08-09']})
last_date = pd.to_datetime('2018-08-08')
n = 3
df['date'] = pd.to_datetime(df['date'])
# Keep rows on or before last_date, newest first, then take the n newest.
df_new = df[df['date'] <= last_date].sort_values("date", ascending=False)
# Restore chronological order.
df_new = df_new[:n].sort_values("date", ascending=True)
print(df_new)
Output:
pay pay2 date
0 209.07 208 2018-08-06
1 207.110001 209.320007 2018-08-07
2 207.250000 206.050003 2018-08-08
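If the dates are moved into the index, the two sorts can be replaced by a single label slice. The following is a sketch of that variant, assuming the date strings parse cleanly with pd.to_datetime. It is an alternative formulation, not the method posted in this answer.

```python
import pandas as pd

df = pd.DataFrame({
    "pay": [209.070007, 207.110001, 207.250000, 208.880005],
    "pay2": [208.000000, 209.320007, 206.050003, 207.279999],
    "date": ["2018-08-06", "2018-08-07", "2018-08-08", "2018-08-09"],
})
last_date = "2018-08-08"
n = 3

indexed = (
    df.assign(date=pd.to_datetime(df["date"]))
      .set_index("date")
      .sort_index()  # sort so that slicing by date label is well defined
)

# On a DatetimeIndex, .loc[:last_date] includes the rows for last_date itself.
df_new = indexed.loc[:last_date].tail(n).reset_index()
print(df_new)
```

Note that reset_index puts the date column first; reorder the columns afterwards if the original layout matters.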