Accessing neighboring rows in a pandas DataFrame

Time: 2014-06-12 07:07:05

Tags: python pandas

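I am trying to compute the local maxima and minima of a series of data: if the value in the current row is greater than or equal to both the previous and the next row, or less than or equal to both, I store the current value in a new column; otherwise that column is set to NaN. Is there a more elegant way to do this than the following?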

import pandas as pd
import numpy as np

rng = pd.date_range('1/1/2014', periods=10, freq='5min')
s = pd.Series([1, 2, 3, 2, 1, 2, 3, 5, 7, 4], index=rng)
df = pd.DataFrame(s, columns=['val'])
df.index.name = "dt"
df['minmax'] = np.nan

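for i in range(len(df.index)):
    # The first and last rows have only one neighbor, so skip them
    if i == 0:
        continue
    if i == len(df.index) - 1:
        continue
    # Local maximum: at least as large as both neighbors
    if df['val'].iloc[i] >= df['val'].iloc[i - 1] and df['val'].iloc[i] >= df['val'].iloc[i + 1]:
        df.loc[df.index[i], 'minmax'] = df['val'].iloc[i]
        continue
    # Local minimum: at most as large as both neighbors
    if df['val'].iloc[i] <= df['val'].iloc[i - 1] and df['val'].iloc[i] <= df['val'].iloc[i + 1]:
        df.loc[df.index[i], 'minmax'] = df['val'].iloc[i]
        continue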

print(df)

The result is:

                     val  minmax
dt                              
2014-01-01 00:00:00    1     NaN
2014-01-01 00:05:00    2     NaN
2014-01-01 00:10:00    3       3
2014-01-01 00:15:00    2     NaN
2014-01-01 00:20:00    1       1
2014-01-01 00:25:00    2     NaN
2014-01-01 00:30:00    3     NaN
2014-01-01 00:35:00    5     NaN
2014-01-01 00:40:00    7       7
2014-01-01 00:45:00    4     NaN

1 answer:

Answer 0 (score: 0)

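We can use shift and where to decide which values to assign. The important point is that when comparing Series we must use the bitwise operators & and | rather than and / or, and wrap each comparison in parentheses. shift returns a Series or DataFrame shifted by 1 row (the default) or by the number of rows passed in.

With where, we pass a boolean condition; the second argument, NaN, is the value assigned wherever the condition is False. NaN is also the default for that argument, so it can be omitted.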
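To make the behavior of shift and where concrete, here is a minimal sketch on a three-element Series. The variable names (`prev_vals`, `next_vals`, `is_peak`, `peaks`) are illustrative and not part of the original answer.

```python
import pandas as pd

s = pd.Series([1, 3, 2])

prev_vals = s.shift(1)    # value from the previous row; the first row becomes NaN
next_vals = s.shift(-1)   # value from the next row; the last row becomes NaN

# Element-wise comparisons; any comparison against NaN evaluates to False,
# so the first and last rows can never be flagged
is_peak = (s > prev_vals) & (s > next_vals)

# Keep values where the condition is True; everything else becomes NaN
peaks = s.where(is_peak)
print(peaks)
```

Because the edge rows compare against NaN, no special handling of the first and last rows is needed.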

In [81]:

df['minmax'] = df['val'].where(
    ((df['val'] < df['val'].shift(1)) & (df['val'] < df['val'].shift(-1)))
    | ((df['val'] > df['val'].shift(1)) & (df['val'] > df['val'].shift(-1))),
    np.nan)
df
Out[81]:
                     val  minmax
dt                              
2014-01-01 00:00:00    1     NaN
2014-01-01 00:05:00    2     NaN
2014-01-01 00:10:00    3       3
2014-01-01 00:15:00    2     NaN
2014-01-01 00:20:00    1       1
2014-01-01 00:25:00    2     NaN
2014-01-01 00:30:00    3     NaN
2014-01-01 00:35:00    5     NaN
2014-01-01 00:40:00    7       7
2014-01-01 00:45:00    4     NaN
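
Note that the answer compares with strict < and >, while the loop in the question uses <= and >=. For this data the results are identical, but they would differ on plateaus (equal adjacent values). The following is a self-contained sketch that wraps the same shift/where idea in a helper; the function name `local_extrema` and its `strict` parameter are hypothetical and introduced only for illustration.

```python
import pandas as pd


def local_extrema(values: pd.Series, strict: bool = True) -> pd.Series:
    """Return values at local minima/maxima and NaN elsewhere.

    Hypothetical helper: strict=True mirrors the answer (< and >),
    strict=False mirrors the loop in the question (<= and >=).
    """
    prev_vals = values.shift(1)
    next_vals = values.shift(-1)
    if strict:
        is_max = (values > prev_vals) & (values > next_vals)
        is_min = (values < prev_vals) & (values < next_vals)
    else:
        is_max = (values >= prev_vals) & (values >= next_vals)
        is_min = (values <= prev_vals) & (values <= next_vals)
    # where() fills NaN by default wherever the mask is False
    return values.where(is_max | is_min)


rng = pd.date_range('1/1/2014', periods=10, freq='5min')
df = pd.DataFrame({'val': [1, 2, 3, 2, 1, 2, 3, 5, 7, 4]}, index=rng)
df.index.name = 'dt'
df['minmax'] = local_extrema(df['val'])
print(df)
```

Passing `strict=False` should reproduce the question's semantics, where both points of a flat peak such as 1, 2, 2, 1 are flagged.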