Question

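I have a CSV file with two columns, Date and Price. I want to create a third column holding the maximum "Price" over the last 5 days. Not the last 5 rows or index positions, but 5 days.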

Contents of "example.csv":

Date            Price
2018-07-23	124.44
2018-07-24	125.49
2018-07-25	123.26
2018-07-31	124.08
2018-08-01	125.10
2018-08-04	121.41
2018-08-05	119.17
2018-08-06	118.58

It should look like this:

Date            Price   High5
2018-07-23	124.44  124.44
2018-07-24	125.49  125.49
2018-07-25	123.26  125.49
2018-07-31	124.08  124.08
2018-08-01	125.10  125.10
2018-08-04	121.41  125.10
2018-08-05	119.17  125.10
2018-08-06	118.58  121.41

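With this code I get the maximum of the entire "Price" column in every row.

import pandas as pd

df = pd.read_csv('example.csv', parse_dates=True, index_col=0)
# .max() returns a single scalar, which is broadcast to every row
df['High5'] = df['Price'].max()

print(df)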

With this code I get the maximum of the last 5 days (ending 2018-08-06) in all rows.

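import pandas as pd

df = pd.read_csv('example.csv', parse_dates=True, index_col=0)
rng = pd.date_range(end='2018-08-06', periods=5, freq='D')
# reindex tolerates dates missing from the index (they become NaN and are
# skipped by max); .loc with missing labels raises KeyError in pandas 1.0+
df['High5'] = df['Price'].reindex(rng).max()

print(df['High5'])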

I don't want the same value in every row, and I know that using a fixed (end) date is wrong. But as a beginner I can't work out the answer.

Answer 1

You want rolling, with a time offset as the window instead of a row count:

df=df.set_index('Date')
df.index=pd.to_datetime(df.index)
df.rolling('5 D').max()
#df=df.rolling('5 D').max().reset_index()
Out[62]: 
             Price
Date              
2018-07-23  124.44
2018-07-24  125.49
2018-07-25  125.49
2018-07-31  124.08
2018-08-01  125.10
2018-08-04  125.10
2018-08-05  125.10
2018-08-06  121.41
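
To get the High5 column the question asks for, the rolling result can be assigned back to the frame rather than replacing Price. The following is a minimal sketch, not the answerer's exact code: it assumes the data is comma-separated and inlines it through io.StringIO so the snippet runs without example.csv (the csv_text and manual names are only for illustration). It also cross-checks the result with an explicit loop, which makes the window visible: for each date t, rolling('5D') by default looks at rows with t - 5 days < Date <= t.

```python
import io

import pandas as pd

# Assumption: comma-separated stand-in for example.csv from the question.
csv_text = """Date,Price
2018-07-23,124.44
2018-07-24,125.49
2018-07-25,123.26
2018-07-31,124.08
2018-08-01,125.10
2018-08-04,121.41
2018-08-05,119.17
2018-08-06,118.58
"""

# Parse Date as datetimes and use it as the index; time-based rolling
# windows need a sorted DatetimeIndex.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# '5D' is a time offset, so the window covers 5 days, not 5 rows.
df["High5"] = df["Price"].rolling("5D").max()

# Cross-check with an explicit loop over the same window (t - 5 days, t].
manual = [
    df.loc[(df.index > t - pd.Timedelta(days=5)) & (df.index <= t), "Price"].max()
    for t in df.index
]

print(df)
```

With a real file, the read_csv call would take the file path instead of the StringIO object, and sep may need adjusting if the file is tab-separated as the listing in the question suggests.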

Python – Pandas: Get the maximum value over the last 5 days
