Suppose I have a 2D numpy array of shape n x m (where n is large and m >= 1). Each column represents one attribute. An example with n = 5, m = 3 is given below:
[[1,2,3],
[4,5,6],
[7,8,9],
[10,11,12],
[13,14,15]]
I want to train my model on the history of the attributes, with history_steps = p (1 < p <= n). For p = 2, the output I expect (of shape (n-p+1) x (m*p)) is:
[[1,4,2,5,3,6],
[4,7,5,8,6,9],
[7,10,8,11,9,12],
[10,13,11,14,12,15]]
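I tried to implement this in pandas by separating the columns and then concatenating the outputs.
import pandas as pd

def buff(s, n):
    # s is a single column (a pandas Series); n is the number of history steps
    return pd.concat([s.shift(-i) for i in range(n)], axis=1).dropna().astype(float)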
However, for my purposes a numpy-based approach would be better. I would also like to avoid splitting and concatenating.
How do I go about it?
Answer 0: (score: 2)
Here is a NumPy-based approach using np.lib.stride_tricks.as_strided.
Implementation:
import numpy as np

def strided_axis0(a, L=2):
    # INPUTS :
    # a : Input array (2D)
    # L : Length along rows to be cut to create per subarray

    # Store shape and strides info
    m, n = a.shape
    s0, s1 = a.strides
    nrows = m - L + 1
    strided = np.lib.stride_tricks.as_strided

    # Finally use strides to get the 3D array view and then reshape
    # (the reshape copies, because the strided view is not contiguous)
    return strided(a, shape=(nrows, n, L), strides=(s0, s1, s0)).reshape(nrows, -1)
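As a quick check, the sketch below runs the function on the question's example array. It also compares the result with np.lib.stride_tricks.sliding_window_view, which is not part of the answer above: it is an alternative swapped in here, it assumes NumPy 1.20 or newer, and the wrapper name windowed_axis0 is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view


def strided_axis0(a, L=2):
    # Same logic as the answer's function: windows of L consecutive rows.
    m, n = a.shape
    s0, s1 = a.strides
    nrows = m - L + 1
    return as_strided(a, shape=(nrows, n, L), strides=(s0, s1, s0)).reshape(nrows, -1)


def windowed_axis0(a, L=2):
    # Hypothetical alternative (assumes NumPy >= 1.20): sliding_window_view
    # appends the window axis last, giving a read-only view of shape (nrows, n, L).
    nrows = a.shape[0] - L + 1
    return sliding_window_view(a, window_shape=L, axis=0).reshape(nrows, -1)


# The 5 x 3 example from the question: rows [1,2,3], [4,5,6], ..., [13,14,15]
a = np.arange(1, 16).reshape(5, 3)

out = strided_axis0(a, L=2)
print(out.shape)
print(out)
print(np.array_equal(out, windowed_axis0(a, L=2)))
```

sliding_window_view computes the shape and strides itself, so it avoids the risk of reading out-of-bounds memory that comes with passing a wrong shape or stride to as_strided by hand.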
Answer 1: (score: 1)
You can use dstack + reshape:
import numpy as np

a = np.array([[1,2,3],
              [4,5,6],
              [7,8,9],
              [10,11,12],
              [13,14,15]])
# use `dstack` to stack the two arrays(one with last row removed, the other with first
# row removed), along the third axis, and then use reshape to flatten the second and third
# dimensions
np.dstack([a[:-1], a[1:]]).reshape(a.shape[0]-1, -1)
#array([[ 1, 4, 2, 5, 3, 6],
# [ 4, 7, 5, 8, 6, 9],
# [ 7, 10, 8, 11, 9, 12],
# [10, 13, 11, 14, 12, 15]])
To generalize to an arbitrary p, use a list comprehension to generate the list of shifted arrays, then apply dstack + reshape:
n, m = a.shape
p = 3
np.dstack([a[i:(n-p+i+1)] for i in range(p)]).reshape(n-p+1, -1)
#array([[ 1, 4, 7, 2, 5, 8, 3, 6, 9],
# [ 4, 7, 10, 5, 8, 11, 6, 9, 12],
# [ 7, 10, 13, 8, 11, 14, 9, 12, 15]])
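The generalization above can be wrapped in a small reusable function. This is a minimal sketch: the name history_windows and the input checks are assumptions added for illustration, not part of the answer.

```python
import numpy as np


def history_windows(a, p):
    # Hypothetical helper: stack p row-shifted slices along a third axis,
    # then flatten the last two axes so each column's history stays adjacent.
    a = np.asarray(a)
    if a.ndim != 2:
        raise ValueError("expected a 2D array")
    n, m = a.shape
    if not 1 <= p <= n:
        raise ValueError("p must satisfy 1 <= p <= n")
    shifted = [a[i:n - p + i + 1] for i in range(p)]  # each has shape (n-p+1, m)
    return np.dstack(shifted).reshape(n - p + 1, m * p)


a = np.arange(1, 16).reshape(5, 3)  # the 5 x 3 example from the question
print(history_windows(a, 2).shape)
print(history_windows(a, 3))
```

Unlike a strided view, this builds p slices and copies them in dstack, so peak memory grows with p. That is usually fine for small p but worth keeping in mind when n is large.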