Question

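I have a case where the mean would include padding values. Given a tensor X of some shape (batch_size, ..., features), there may be zero-padded features so that everything has the same shape.

How can I average over the final dimension of X (the features), but only over the non-zero entries? That is, we divide the sum by the number of non-zero entries.

Example input: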

x = [[[[1,2,3], [2,3,4], [0,0,0]],
      [[1,2,3], [2,0,4], [3,4,5]],
      [[1,2,3], [0,0,0], [0,0,0]],
      [[1,2,3], [1,2,3], [0,0,0]]],
     [[[1,2,3], [0,1,0], [0,0,0]],
      [[1,2,3], [2,3,4], [0,0,0]],
      [[1,2,3], [0,0,0], [0,0,0]],
      [[1,2,3], [1,2,3], [1,2,3]]]]
# Desired output
y = [[[1.5 2.5 3.5]
      [2.  2.  4. ]
      [1.  2.  3. ]
      [1.  2.  3. ]]
     [[0.5 1.5 1.5]
      [1.5 2.5 3.5]
      [1.  2.  3. ]
      [1.  2.  3. ]]]

Answer 1

A pure Keras solution counts the non-zero feature vectors and then divides the sum by that count. Here is a custom layer:

import keras.layers as L
import keras.backend as K

class NonZeroMean(L.Layer):
  """Compute mean of non-zero entries."""
  def call(self, x): 
    """Calculate non-zero mean."""
    # count the number of nonzero features, last axis
    nonzero = K.any(K.not_equal(x, 0.0), axis=-1)
    n = K.sum(K.cast(nonzero, 'float32'), axis=-1, keepdims=True)
    x_mean = K.sum(x, axis=-2) / n
    return x_mean

  def compute_output_shape(self, input_shape):
    """Collapse summation axis."""
    return input_shape[:-2] + (input_shape[-1],)

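I suppose a condition is needed to check whether all features are zero and return zero in that case; otherwise we get a division by zero. The current example is tested with: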

import numpy as np

# Dummy data
x = [[[[1,2,3], [2,3,4], [0,0,0]],
      [[1,2,3], [2,0,4], [3,4,5]],
      [[1,2,3], [0,0,0], [0,0,0]],
      [[1,2,3], [1,2,3], [0,0,0]]],
     [[[1,2,3], [0,1,0], [0,0,0]],
      [[1,2,3], [2,3,4], [0,0,0]],
      [[1,2,3], [0,0,0], [0,0,0]],
      [[1,2,3], [1,2,3], [1,2,3]]]]
x = np.array(x, dtype='float32')

# Example run
x_input = K.placeholder(shape=x.shape, name='x_input')
out = NonZeroMean()(x_input)
s = K.get_session()
print("INPUT:", x)
print("OUTPUT:", s.run(out, feed_dict={x_input: x}))
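
One way the all-zero check could be handled is to clamp the count to at least 1, so a group that is entirely padding yields 0 / 1 = 0 rather than 0 / 0. Below is a minimal NumPy sketch of the same computation with that guard, so it can be checked without a session. The function name nonzero_mean is made up for this illustration and is not part of Keras. In the layer itself, the analogous change would presumably be n = K.maximum(n, 1.0), which I have not tested here.

```python
import numpy as np

def nonzero_mean(x):
    """Mean over axis -2, counting only feature vectors with a non-zero entry.

    Hypothetical helper for illustration; mirrors the layer's call() in NumPy.
    """
    x = np.asarray(x, dtype='float32')
    # True where a feature vector has at least one non-zero entry
    nonzero = np.any(x != 0.0, axis=-1)
    # number of non-padding vectors, kept broadcastable against the sum
    n = np.sum(nonzero.astype('float32'), axis=-1, keepdims=True)
    # assumed guard: clamp to 1 so all-padding groups give 0 / 1 = 0
    n = np.maximum(n, 1.0)
    return np.sum(x, axis=-2) / n

x = [[[1, 2, 3], [2, 3, 4], [0, 0, 0]],   # two real vectors, one padding
     [[0, 0, 0], [0, 0, 0], [0, 0, 0]]]   # padding only
print(nonzero_mean(x))
```

The first group averages to [1.5, 2.5, 3.5] and the padding-only group comes back as zeros instead of NaN. Note that a vector like [0, 1, 0] still counts as non-padding, which matches the behaviour of the layer above.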
