Suppose I have a DataFrame:
import pandas as pd
from random import randint
df = pd.DataFrame({'A': [randint(1, 9) for x in range(1000)],
                   'B': ...,
                   'C': ...})
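I want to select rows as follows: a row is selected if at least X consecutive neighbouring rows (in either direction) have A values that satisfy: abs(myRowAValue-meanAValueOfTheXNeighbors)
In other words, I want to select the rows where the A value is fairly constant.
I'm looking for the most efficient "pandas" way to do this. Thanks for your help.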
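Answer 0: (score: 0)
I'm not sure what your expected result looks like, but see whether this helps (it may not be the most efficient approach, and it is not pure pandas):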
import pandas as pd
from random import randint
import numpy as np
df = pd.DataFrame({'A': [randint(1, 9) for x in range(1000)]})
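neighbours = 10  # number of rows to look at on each side
tolerance = 2  # maximum allowed distance from the neighbourhood mean
nparray = np.array(df['A'])
nparray_len = len(nparray)
# The index checks come first so np.average is skipped for rows outside each
# range (the middle slice would otherwise be empty for the first rows).
# First rows: the window is clipped at the start of the array.
fbegin = [iterator for iterator, element in enumerate(nparray) if iterator < neighbours and abs(element - np.average(nparray[:iterator+neighbours])) < tolerance]
# Middle rows: the window fits on both sides.
fmid = [iterator for iterator, element in enumerate(nparray) if neighbours <= iterator <= nparray_len - neighbours and abs(element - np.average(nparray[iterator-neighbours:iterator+neighbours])) < tolerance]
# Last rows: the window is clipped at the end of the array.
fend = [iterator for iterator, element in enumerate(nparray) if iterator > nparray_len - neighbours and abs(element - np.average(nparray[iterator-neighbours:])) < tolerance]
# Sorted, de-duplicated positions of the selected rows.
IDS = np.unique(np.array(fbegin + fmid + fend, dtype=int))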
df_constant = df.iloc[IDS]
print(df_constant)
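If a vectorized version is preferable, the same idea can probably be expressed with a centred rolling mean. This is only a sketch under a few assumptions: the function name select_constant_rows and its parameters are made up for illustration, the window is symmetric and covers 2 * neighbours + 1 rows including the row itself (so it is not identical to the slices above), and the comparison is a strict "less than tolerance" as in the loop version.

```python
import numpy as np
import pandas as pd


def select_constant_rows(df, column="A", neighbours=10, tolerance=2):
    """Keep rows whose value lies within `tolerance` of a centred rolling mean.

    Hypothetical helper, not part of pandas.
    """
    # Centred window of 2 * neighbours + 1 rows. min_periods=1 lets the
    # window shrink at both ends of the frame instead of yielding NaN.
    window_mean = (
        df[column]
        .rolling(window=2 * neighbours + 1, center=True, min_periods=1)
        .mean()
    )
    # Boolean mask: True where the row is close to its neighbourhood mean.
    mask = (df[column] - window_mean).abs() < tolerance
    return df[mask]


# Example data, seeded so the run is repeatable (the seed is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.integers(1, 10, size=1000)})
df_constant = select_constant_rows(df)
print(df_constant.head())
```

Because the mean is computed once by rolling() rather than once per row inside a list comprehension, this should scale better on long frames, but it is worth timing on your own data before relying on that.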