I am trying to implement scikit-learn's PolynomialFeatures as a layer in a feedforward neural network in TensorFlow and Keras. I'll give an example using NumPy arrays for the sake of simplicity. If a batch has three samples and the activations of a certain layer are equal to the (3, 2)-shaped matrix
>>> import numpy as np
>>> X = np.arange(0, 6).reshape(3, 2)
>>> X
array([[0, 1],
       [2, 3],
       [4, 5]])
then I would like the activations in the next layer to be equal to a degree-2 polynomial feature expansion of X:
>>> from sklearn.preprocessing import PolynomialFeatures
>>> PolynomialFeatures(degree=2).fit_transform(X)
array([[ 1.,  0.,  1.,  0.,  0.,  1.],
       [ 1.,  2.,  3.,  4.,  6.,  9.],
       [ 1.,  4.,  5., 16., 20., 25.]])
That is, if the activations of layer i are the matrix X (of shape (batch_size, num_features)), then for the parameter choice degree=2 I would like the activations of layer i + 1 to be a concatenation of a column of batch_size many 1.'s, X itself, and the products of all pairs of columns of X: X[:, 0] * X[:, 0], X[:, 0] * X[:, 1], and X[:, 1] * X[:, 1].

My closest solution so far is to concatenate some powers of X:
import keras.backend as K
X = K.reshape(K.arange(0, 6), (3, 2))
with K.get_session().as_default():
    print(K.concatenate([K.pow(X, 0), K.pow(X, 1), K.pow(X, 2)]).eval())
Output:
[[ 1  1  0  1  0  1]
 [ 1  1  2  3  4  9]
 [ 1  1  4  5 16 25]]
i.e., a concatenation of two columns of 1s (one more than I'd like, but I can live with this duplication), X itself, and X squared element-wise.
Is there a way to compute products of different columns (in an automatically differentiable way)? The step of PolynomialFeatures that I cannot figure out how to implement in TensorFlow is to fill in a column of a matrix with the product (across axis=1) of certain columns of another matrix: XP[:, i] = X[:, c].prod(axis=1), where c is a tuple of indices such as (0, 0, 1).
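For concreteness, the NumPy step described above can be sketched as follows. This is a simplified illustration, not scikit-learn's actual implementation: the helper name polynomial_features is made up here, and it assumes the index tuples c are generated with itertools.combinations_with_replacement, which reproduces the column order shown earlier for degree=2.

```python
from itertools import combinations_with_replacement

import numpy as np


def polynomial_features(X, degree=2):
    """Hypothetical helper: build each output column as a product of input columns."""
    X = np.asarray(X)
    num_features = X.shape[1]
    columns = []
    for d in range(degree + 1):
        for c in combinations_with_replacement(range(num_features), d):
            idx = np.array(c, dtype=int)
            # For the empty tuple, the product over zero columns is a column of ones.
            columns.append(X[:, idx].prod(axis=1))
    return np.stack(columns, axis=1)


X = np.arange(0, 6).reshape(3, 2)
XP = polynomial_features(X, degree=2)
print(XP)
```

What is missing is a differentiable equivalent of the X[:, idx].prod(axis=1) line.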
Answer 0 (score: 0)
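What if you took a vector containing all the base features plus a constant 1 and formed the outer product of that vector with itself? See "Outer product in tensorflow".
To get higher powers, I guess you could apply the outer product with the same vector again.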
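A minimal NumPy sketch of the outer-product idea, for degree 2 only. The function name degree2_features_via_outer is made up for illustration, and keeping only the upper triangle of the (symmetric) outer product is an added assumption so that each product appears once.

```python
import numpy as np


def degree2_features_via_outer(X):
    """Degree-2 polynomial features from a per-sample outer product (sketch)."""
    X = np.asarray(X, dtype=float)
    batch_size, num_features = X.shape
    # Prepend the constant 1 so the outer product also contains 1 and X itself.
    Xa = np.concatenate([np.ones((batch_size, 1)), X], axis=1)
    # outer[b, i, j] = Xa[b, i] * Xa[b, j]
    outer = np.einsum("bi,bj->bij", Xa, Xa)
    # The outer product is symmetric, so keep each product only once.
    rows, cols = np.triu_indices(num_features + 1)
    return outer[:, rows, cols]


X = np.arange(0, 6).reshape(3, 2)
features = degree2_features_via_outer(X)
print(features)
```

For the example matrix X from the question, this should give the same columns, in the same order, as PolynomialFeatures(degree=2). The same steps could presumably be expressed with differentiable TensorFlow ops such as tf.einsum and tf.gather, but that translation is untested here.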
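The dimension of the output will grow very quickly. Might a trainable selector of dimensions for a limited number of polynomial (or general product, prod(xi ^ wi)) features be an alternative? For some applications, DeepMind's NALU units might be useful. They are able to learn weighted addition and weighted multiplication (of positive numbers).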
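As a rough illustration of the prod(xi ^ wi) idea, the product can be computed in log space, which is similar in spirit to the multiplicative path of a NALU but is not a NALU itself. The name power_product_features and the fixed exponent matrix W are assumptions for this sketch. In a trainable layer W would be a learned weight, and the identity used only holds for strictly positive inputs.

```python
import numpy as np


def power_product_features(X, W):
    """Compute prod_j X[:, j] ** W[k, j] for every output unit k (sketch).

    Uses prod_j x_j ** w_j = exp(sum_j w_j * log(x_j)), which requires
    strictly positive inputs.
    """
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    if np.any(X <= 0):
        raise ValueError("inputs must be strictly positive")
    return np.exp(np.log(X) @ W.T)


X = np.array([[1.0, 2.0], [3.0, 4.0]])
# Each row of W holds the exponents of one output feature:
# x0**2, x0*x1 and x1**2.
W = np.array([[2.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
features = power_product_features(X, W)
print(features)
```

With integer exponents as above, the rows of W select ordinary monomials. With learned real-valued exponents, the layer could pick a small number of product features instead of enumerating all of them.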
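Update: Another way to extract a limited number of polynomial features is to stack multiplicative layers of the form f(PI(wij * xj + bij)) (with unit activation f), as described by Yadav, Kalra & John (2006) and implemented by myself here (not yet fully tested).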
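The forward pass of one such multiplicative layer can be sketched in NumPy as below. This is only a direct reading of the formula f(PI(wij * xj + bij)), not the implementation referred to above; the name multiplicative_layer and the hand-picked weights are assumptions for illustration.

```python
import numpy as np


def multiplicative_layer(X, W, B, f=np.tanh):
    """out[b, i] = f(prod_j (W[i, j] * X[b, j] + B[i, j])) (sketch)."""
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    B = np.asarray(B, dtype=float)
    # Broadcast to shape (batch_size, num_units, num_features).
    terms = W[None, :, :] * X[:, None, :] + B[None, :, :]
    # Multiply over the input features, then apply the unit activation.
    return f(terms.prod(axis=-1))


X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
# Unit 0 computes x0 * x1; unit 1 computes x0 * (0 * x1 + 1) = x0.
W = np.array([[1.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0, 0.0], [0.0, 1.0]])
out = multiplicative_layer(X, W, B, f=lambda z: z)
print(out)
```

With an identity activation and these weights, the first unit reproduces the cross term X[:, 0] * X[:, 1] from the question. Stacking such layers would raise the attainable polynomial degree.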