How do I apply a function to a rolling window in numpy?

Time: 2019-06-16 23:05:03

Tags: python numpy

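I have a series of (x, y) data points, and I want to use a rolling window of three points. I want to apply a function to each window, essentially mapping over the rolling windows. How can I do this in numpy?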

2 answers:

Answer 0: (score: 1)

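I think the fastest way to do this kind of operation is to make three copies of the array, each offset by one element from the previous one, and stack them. Because np.roll wraps around, the last two rows mix the end of the array with its start, so they are dropped with [:-2]. For example: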

In [1]: a = np.arange(12)

In [2]: a
Out[2]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [3]: np.vstack((a,np.roll(a,-1),np.roll(a,-2))).T[:-2]
Out[3]: 
array([[ 0,  1,  2],
       [ 1,  2,  3],
       [ 2,  3,  4],
       [ 3,  4,  5],
       [ 4,  5,  6],
       [ 5,  6,  7],
       [ 6,  7,  8],
       [ 7,  8,  9],
       [ 8,  9, 10],
       [ 9, 10, 11]])

You can then apply a function along the last axis. For example, to compute a rolling sum:

def window_function(a):
    return np.sum(a,axis=-1)

>>> a = np.arange(12)
>>> # A list comprehension prints a plain list on Python 3, where map() is lazy
>>> [int(window_function(a[i:i+3])) for i in range(len(a)-2)]
[3, 6, 9, 12, 15, 18, 21, 24, 27, 30]
>>> window_function(np.vstack((a,np.roll(a,-1),np.roll(a,-2))).T[:-2])
array([ 3,  6,  9, 12, 15, 18, 21, 24, 27, 30])

This can be generalized with a function:

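def get_rolling_window(a, size):
    # Pass vstack a list (generators are deprecated or rejected, depending on the
    # numpy version) and slice by an explicit length so that size=1 also works.
    return np.vstack([np.roll(a, -i) for i in range(size)]).T[:len(a) - size + 1]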
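
On NumPy 1.20 or later, a similar window matrix can be built without copying by using np.lib.stride_tricks.sliding_window_view. This is a different technique from the roll-and-stack approach above, and what follows is only a minimal sketch for the (x, y) case in the question. The helper name rolling_apply and the sample points array are hypothetical, made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_apply(points, size, func):
    """Apply func to every window of `size` consecutive rows of `points`.

    Hypothetical helper: func is assumed to accept all windows at once as an
    array of shape (n_windows, size, n_columns).
    """
    points = np.asarray(points)
    # Slide along axis 0; the window axis is appended last,
    # giving shape (n_windows, n_columns, size).
    windows = sliding_window_view(points, size, axis=0)
    # Reorder to (n_windows, size, n_columns) so that each window reads as
    # a small block of consecutive (x, y) rows.
    return func(np.swapaxes(windows, 1, 2))


# Made-up sample data: six (x, y) points.
points = np.array([[0, 0], [1, 2], [2, 4], [3, 6], [4, 8], [5, 10]])

# Mean (x, y) position over each window of three points.
centroids = rolling_apply(points, 3, lambda w: w.mean(axis=1))
```

The windows are read-only views into the original data, so this sketch suits functions that only read from each window. A function that cannot be vectorized could instead be called once per window in a Python loop, at the cost of speed.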

Answer 1: (score: 0)

numpy has a function called np.convolve that covers this case, for example if you need a sum:

>>> n = 3
>>> np.convolve(a, np.ones((n,)), mode='valid')
array([ 3.,  6.,  9., 12., 15., 18., 21., 24., 27., 30.])
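
The same idea extends to any window function that is a weighted sum. As a small sketch, assuming the same a = np.arange(12) used in the first answer, scaling the window of ones by 1/n turns the rolling sum into a rolling mean:

```python
import numpy as np

a = np.arange(12)
n = 3

# Rolling sum: convolve with a window of ones.
rolling_sum = np.convolve(a, np.ones(n), mode="valid")

# Rolling mean: the same window, scaled so that the weights sum to 1.
rolling_mean = np.convolve(a, np.ones(n) / n, mode="valid")
```

np.convolve reverses the second argument before sliding it across the first, so asymmetric weights are applied in reverse order. Functions that are not weighted sums, such as a rolling maximum, cannot be expressed this way.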
