TensorFlow 2.0 / Keras: Using a SparseTensor as input to a Conv2D layer

Time: 2019-11-25 02:42:17

Tags: python tensorflow conv-neural-network tensorflow2.0 tf.keras

I am trying to build a network with the Keras library in TensorFlow 2.0 that

  1. takes a SparseTensor as its input, and
  2. uses a Conv2D layer as its first hidden layer.

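My motivation for this architecture is as follows:

  1. My input consists of samples of 2D geospatial data.
  2. Because each sample uses 2 channels, my overall input tensor is 4-dimensional (1 dimension for the samples, 2 dimensions for X and Y, and 1 dimension for the channels).
  3. Each data sample is too large to fit into the available memory.
  4. However, each sample is very sparse, so by switching the overall representation of the input from a 4D NumPy ndarray to a 4D SparseTensor, we can reduce the number of points that have to be explicitly defined in the input tensor by about 99.8%.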
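To put rough numbers on the memory argument, here is a back-of-the-envelope sketch. It assumes the input shape (204, 9808, 5077, 2) and the float32 dtype reported in the error message further down, reads "about 99.8%" as "roughly 0.2% of the entries are stored", and counts only the indices and values of a tf.SparseTensor (one int64 index per dimension plus the value). The variable names are illustrative.

```python
# Shape taken from the error message in the question: (samples, X, Y, channels).
shape = (204, 9808, 5077, 2)
BYTES_PER_FLOAT32 = 4
BYTES_PER_INT64 = 8

# Number of entries and bytes in the dense representation.
dense_elements = 1
for dim in shape:
    dense_elements *= dim
dense_bytes = dense_elements * BYTES_PER_FLOAT32

# Assumption: a 99.8% reduction means about 0.2% of the entries are explicit.
explicit_values = dense_elements * 2 // 1000

# Each explicit entry costs one int64 index per dimension plus one float32 value.
bytes_per_entry = len(shape) * BYTES_PER_INT64 + BYTES_PER_FLOAT32
sparse_bytes = explicit_values * bytes_per_entry

print(f"dense:  {dense_elements:,} values, {dense_bytes / 1e9:.1f} GB")
print(f"sparse: {explicit_values:,} values, {sparse_bytes / 1e9:.1f} GB")
```

Under these assumptions the dense array needs roughly 81 GB, while the sparse form needs roughly 1.5 GB, even though every stored entry is nine times larger than a bare float32.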

Unfortunately, whenever I try to train the model, I keep running into the following error:

Traceback (most recent call last):
  File "C:\path\to\virtualenv\lib\site-packages\tensorflow_core\python\util\nest.py", line 318, in assert_same_structure
    expand_composites)
ValueError: The two structures don't have the same nested structure.

First structure: type=SparseTensorSpec str=SparseTensorSpec(TensorShape([204, 9808, 5077, 2]), tf.float32)

Second structure: type=Tensor str=Tensor("conv2d_input:0", shape=(None, 9808, 5077, 2), dtype=float32)

More specifically: Substructure "type=SparseTensorSpec str=SparseTensorSpec(TensorShape([204, 9808, 5077, 2]), tf.float32)" is a sequence, while substructure "type=Tensor str=Tensor("conv2d_input:0", shape=(None, 9808, 5077, 2), dtype=float32)" is not

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "neural_network.py", line 137, in <module>
    callbacks = [tensorboard_callback, lr_callback])
  File "C:\path\to\virtualenv\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 728, in fit
    use_multiprocessing=use_multiprocessing)
  File "C:\path\to\virtualenv\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 224, in fit
    distribution_strategy=strategy)
  File "C:\path\to\virtualenv\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 547, in _process_training_inputs
    use_multiprocessing=use_multiprocessing)
  File "C:\path\to\virtualenv\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 594, in _process_inputs
    steps=steps)
  File "C:\path\to\virtualenv\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 2497, in _standardize_user_data
    nest.assert_same_structure(a, b, expand_composites=True)
  File "C:\path\to\virtualenv\lib\site-packages\tensorflow_core\python\util\nest.py", line 325, in assert_same_structure
    % (str(e), str1, str2))
ValueError: The two structures don't have the same nested structure.

First structure: type=SparseTensorSpec str=SparseTensorSpec(TensorShape([204, 9808, 5077, 2]), tf.float32)

Second structure: type=Tensor str=Tensor("conv2d_input:0", shape=(None, 9808, 5077, 2), dtype=float32)

More specifically: Substructure "type=SparseTensorSpec str=SparseTensorSpec(TensorShape([204, 9808, 5077, 2]), tf.float32)" is a sequence, while substructure "type=Tensor str=Tensor("conv2d_input:0", shape=(None, 9808, 5077, 2), dtype=float32)" is not
Entire first structure:
.
Entire second structure:
.
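Reading the message: the "first structure" is the SparseTensor passed to fit, and the "second structure" is the dense placeholder `conv2d_input` that the model declares, since Conv2D operates on dense tensors. One possible way around this, sketched below on a tiny stand-in dataset, is to keep the data sparse in memory and densify it one sample at a time inside a tf.data pipeline. This is only a sketch under assumptions: the 8x32x32x2 shape, the toy labels, and the small model are made up for illustration and are not the author's setup.

```python
import numpy as np
import tensorflow as tf

# Tiny, mostly-zero stand-in for the real data (illustrative sizes only).
dense = np.zeros((8, 32, 32, 2), dtype=np.float32)
dense[0, 3, 4, 0] = 1.5
dense[5, 20, 7, 1] = -2.0
x_sparse = tf.sparse.from_dense(dense)   # 4D SparseTensor, 2 explicit values
y = np.zeros((8, 1), dtype=np.float32)   # toy labels

# Slice the SparseTensor along the sample axis, densify each sample,
# then batch. Only one batch is ever held in dense form.
ds = (
    tf.data.Dataset.from_tensor_slices((x_sparse, y))
    .map(lambda xs, ys: (tf.sparse.to_dense(xs), ys))
    .batch(4)
)

# Small dense-input model, mirroring the Conv2D -> Flatten -> Dense layout.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(filters=1, kernel_size=3, input_shape=(32, 32, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mae", optimizer="adam")

# fit receives dense batches, so its input structure matches the model's.
history = model.fit(ds, epochs=1, verbose=0)
```

Note that at the real input size a single dense sample is 9808 x 5077 x 2 float32 values, about 398 MB, so a batch of 32 would be around 12.7 GB. The batch size would probably have to shrink for this approach to be practical.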

My model is built and trained with the following code:

import tensorflow as tf

def build_model(x_conv_dim, y_conv_dim, channels):
  """
  Building the neural network model (architecture)
  Returns: model -- NN model including the layers, parameter initialization, and activation
  """
  model = tf.keras.models.Sequential([
      tf.keras.layers.Conv2D(
        input_shape = (x_conv_dim, y_conv_dim, channels),
        filters = 1,
        kernel_size = 300,
        strides = 10,
        padding = 'valid',
        activation = 'relu',
        kernel_initializer = 'he_normal',
        bias_initializer = 'zeros'),
      # tf.keras.layers.BatchNormalization(axis = 1, trainable=True, epsilon=BATCH_NORM_EPSILON),
      # tf.keras.layers.Activation('relu'),
      tf.keras.layers.Flatten(),
      tf.keras.layers.Dense(Y_train_orig.shape[0], kernel_initializer='he_normal', bias_initializer='zeros'),
      tf.keras.layers.BatchNormalization(axis = 1, trainable=True, epsilon=BATCH_NORM_EPSILON),
  ])

  model.compile(loss='mae',
      optimizer='adam',
      metrics=['mse'])

  return model

# Each X value is a SparseTensor.
X_train_orig, Y_train_orig, X_dev_orig, Y_dev_orig, X_test_orig, Y_test_orig = utils.load_dataset()

# Building the model
model = build_model(
  x_conv_dim = X_train_orig.shape[1],
  y_conv_dim = X_train_orig.shape[2],
  channels = X_train_orig.shape[3])

# Training model.
history = model.fit(
    x = X_train_orig,
    y = Y_train_orig.T,
    epochs = 100,
    batch_size = 32,
    shuffle = True,
    verbose = 1,
    callbacks = [tensorboard_callback, lr_callback])
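For reference, the layer sizes implied by this architecture can be worked out by hand. The sketch below assumes the spatial dimensions 9808 x 5077 and the 2 channels from the error message, and uses the standard output-size formula for 'valid' padding, floor((n - k) / s) + 1. The helper name `conv_output_size` is hypothetical.

```python
def conv_output_size(input_size, kernel_size, stride):
    """Output length along one axis for a convolution with 'valid' padding."""
    return (input_size - kernel_size) // stride + 1

# Values from the question: kernel_size=300, strides=10, filters=1, 2 channels.
KERNEL, STRIDE, FILTERS, CHANNELS = 300, 10, 1, 2

out_x = conv_output_size(9808, KERNEL, STRIDE)
out_y = conv_output_size(5077, KERNEL, STRIDE)

# Number of features handed to the Dense layer after Flatten().
flattened = out_x * out_y * FILTERS

# Trainable parameters of the Conv2D layer: kernel weights plus one bias per filter.
conv_params = KERNEL * KERNEL * CHANNELS * FILTERS + FILTERS

print(out_x, out_y, flattened, conv_params)
```

So the convolution maps each sample to a 951 x 478 x 1 feature map, which Flatten turns into 454,578 features for the Dense layer, and the Conv2D layer itself has 180,001 trainable parameters.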

Can anyone recommend a way to get this architecture working, and/or point out what is wrong with my approach above?

Thanks in advance for your help!

0 answers:

No answers yet.