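Understanding Keras model architecture (node index of a nested model)

Date: 2017-09-02 08:03:18

Tags: deep-learning keras

This script defines a dummy model that contains a small nested model: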

from keras.layers import Input, Dense
from keras.models import Model
import keras

input_inner = Input(shape=(4,), name='input_inner')
output_inner = Dense(3, name='inner_dense')(input_inner)
inner_model = Model(inputs=input_inner, outputs=output_inner)

input = Input(shape=(5,), name='input')
x = Dense(4, name='dense_1')(input)
x = inner_model(x)
x = Dense(2, name='dense_2')(x)

output = keras.layers.concatenate([x, x], name='concat_1')
model = Model(inputs=input, outputs=output)

print(model.summary())

It produces the following output:

Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input (InputLayer)               (None, 5)             0                                            
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 4)             24          input[0][0]                      
____________________________________________________________________________________________________
model_1 (Model)                  (None, 3)             15          dense_1[0][0]                    
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 2)             8           model_1[1][0]                    
____________________________________________________________________________________________________
concat_1 (Concatenate)           (None, 4)             0           dense_2[0][0]                    
                                                                   dense_2[0][0]                    

My question concerns the content of the Connected to column. I understand that a layer can have multiple nodes.

The notation used in this column is layer_name[node_index][tensor_index].
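To make the notation concrete, here is a minimal sketch that splits one Connected to entry into its three parts. The helper name `parse_connection` is hypothetical and is not part of Keras; it only illustrates how the string is structured.

```python
import re

# Hypothetical helper (not a Keras function): matches entries of the form
# layer_name[node_index][tensor_index].
_CONNECTION = re.compile(r"^(?P<layer>.+)\[(?P<node>\d+)\]\[(?P<tensor>\d+)\]$")


def parse_connection(entry):
    """Return (layer_name, node_index, tensor_index) for one table entry."""
    match = _CONNECTION.match(entry.strip())
    if match is None:
        raise ValueError("unexpected entry: %r" % (entry,))
    return match.group("layer"), int(match.group("node")), int(match.group("tensor"))


print(parse_connection("model_1[1][0]"))  # ('model_1', 1, 0)
```

For the summary above, this gives node index 1 for model_1 and node index 0 for every other layer.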

If we regard inner_model as a layer, I would expect it to have only one node, so I would expect dense_2 to be connected to model_1[0][0]. In fact it is connected to model_1[1][0]. Why is that?

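3 answers:

Answer 0 (score: 5)

1. Background

When you say:

"If we regard inner_model as a layer, I would expect it to have only one node"

this is correct in the sense that it has only one node that is part of the network.

Look at the source of the model.summary function in the Keras GitHub repository. The function that prints the connections is print_layer_summary_with_connections (line 76), and it only considers nodes from the relevant_nodes array. Every node that is not in this array is treated as not belonging to the network, so the function skips it. The relevant lines are lines 88-90: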

if relevant_nodes and node not in relevant_nodes:
    # node is not part of the current network
    continue
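The filtering step can be imitated with plain Python objects. This is only a toy sketch under stated assumptions: `ToyNode` and `connection_strings` are made-up names, and the node fields are reduced to the ones needed to build a Connected to entry. It is not the Keras source.

```python
from collections import namedtuple

# Toy stand-in for a node (assumption: only the fields used below).
ToyNode = namedtuple(
    "ToyNode", ["label", "inbound_layers", "node_indices", "tensor_indices"])


def connection_strings(layer_inbound_nodes, relevant_nodes):
    """Build 'layer[node][tensor]' entries, skipping nodes outside the network."""
    entries = []
    for node in layer_inbound_nodes:
        if relevant_nodes and node not in relevant_nodes:
            # node is not part of the current network
            continue
        for layer, node_index, tensor_index in zip(
                node.inbound_layers, node.node_indices, node.tensor_indices):
            entries.append("%s[%d][%d]" % (layer, node_index, tensor_index))
    return entries


# A node that consumes the tensor produced at node index 1 of model_1,
# plus an unrelated node that does not belong to the network.
dense_2_node = ToyNode("dense_2_call", ["model_1"], [1], [0])
stray_node = ToyNode("unrelated_call", ["other"], [0], [0])

print(connection_strings([stray_node, dense_2_node], [dense_2_node]))
# ['model_1[1][0]']
```

The stray node is dropped before any entry is formatted, so only connections that belong to the current network reach the table.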

2. Your model

Now let's see what happens with your specific model. First, let's define relevant_nodes:

relevant_nodes = []
for v in model.nodes_by_depth.values():
    relevant_nodes += v

The relevant_nodes array looks like this:

[<keras.engine.topology.Node at 0x9dfa518>,
 <keras.engine.topology.Node at 0x9dfa278>,
 <keras.engine.topology.Node at 0x9d8bac8>,
 <keras.engine.topology.Node at 0x9d8ba58>,
 <keras.engine.topology.Node at 0x9d74518>]

However, when we print the inbound nodes of each layer, we get:

for i in model.layers:
    print(i.inbound_nodes)

[<keras.engine.topology.Node object at 0x0000000009D74518>]
[<keras.engine.topology.Node object at 0x0000000009D8BA58>]
[<keras.engine.topology.Node object at 0x0000000009D743C8>, <keras.engine.topology.Node object at 0x0000000009D8BAC8>]
[<keras.engine.topology.Node object at 0x0000000009DFA278>]
[<keras.engine.topology.Node object at 0x0000000009DFA518>]

You can see that exactly one node in the list above does not appear in relevant_nodes. It is the node at position 0 of the third array:

<keras.engine.topology.Node object at 0x0000000009D743C8>

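It is not considered part of the model and therefore does not appear in relevant_nodes. The node at position 1 of that array does appear in relevant_nodes, which is why you see it as model_1[1][0].

3. Reason

The reason is essentially the line that calls inner_model (x = inner_model(x) in your script). You can see the same effect even with a much smaller model, such as: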

input_inner = Input(shape=(4,), name='input_inner')
output_inner = Dense(3, name='inner_dense')(input_inner)
inner_model = Model(inputs=input_inner, outputs=output_inner)

input = Input(shape=(5,), name='input')
output = inner_model(input)

model = Model(inputs=input, outputs=output)

You will see that relevant_nodes contains two elements, whereas

for i in model.layers:
    print(i.inbound_nodes)

gives you three nodes.

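This is because layer 1 (the inner model in the small model above) has two nodes, but only the second one is considered part of the model. In particular, if you print the input at each node of layer 1 using layer.get_input_at(node_index), you get: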

print(model.layers[1].get_input_at(0))
print(model.layers[1].get_input_at(1))

#prints
/input_inner
/input

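4. Answers to the questions in the comments

"1) Do you also know what this irrelevant node is used for / where it comes from?"

This node seems to be an "internal node" created during the application of inner_model. In particular, if you print the input and output shapes of each of the three nodes (in the small model above), you get: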

nodes=[model.layers[0].inbound_nodes[0],model.layers[1].inbound_nodes[0],model.layers[1].inbound_nodes[1]]
for i in nodes:
    print(i.input_shapes)
    print(i.output_shapes)
    print(" ")

#prints
[(None, 5)]
[(None, 5)]

[(None, 4)]
[(None, 3)]

[(None, 5)]
[(None, 3)]

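So you can see that the shapes of the middle node (the one that does not appear in the list of relevant nodes) correspond to the shapes of inner_model.

"2) Will an inner model with n output nodes always present them with node indices 1 to n rather than 0 to n-1?"

I am not sure this is always the case, since I guess there are several ways to end up with multiple output nodes. It does hold, however, for the following very natural generalization of the small model above: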

input_inner = Input(shape=(4,), name='input_inner')
output_inner = Dense(3, name='inner_dense')(input_inner)
inner_model = Model(inputs=input_inner, outputs=output_inner)

input = Input(shape=(5,), name='input')
output = inner_model(input)
output = inner_model(output)

model = Model(inputs=input, outputs=output)

print(model.summary())

Here I have only added output = inner_model(output) to the small model. The list of relevant nodes is

[<keras.engine.topology.Node at 0xd10c390>,
 <keras.engine.topology.Node at 0xd10c9b0>,
 <keras.engine.topology.Node at 0xd10ca20>]

and the list of all inbound nodes is

[<keras.engine.topology.Node object at 0x000000000D10CA20>]
[<keras.engine.topology.Node object at 0x000000000D10C588>, <keras.engine.topology.Node object at 0x000000000D10C9B0>, <keras.engine.topology.Node object at 0x000000000D10C390>]

Indeed, the node indices are 1 and 2, as you mentioned in the comment. The pattern continues: if I add another output = inner_model(output), the node indices are 1, 2, 3, and so on.
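Both observations (one internal node at index 0, then indices 1 to n for the calls) can be imitated with a toy class. This is a sketch under an assumption suggested by the shapes printed earlier: building the model registers one node linking its own input to its own output. `ToyModel` is a made-up stand-in and not the Keras implementation.

```python
class ToyModel:
    """Toy stand-in for a nested model; not the Keras implementation."""

    def __init__(self, name):
        self.name = name
        self.inbound_nodes = []
        # Assumption: building the model registers one node that links its
        # own input to its own output (the "internal node" at index 0).
        self.inbound_nodes.append("internal")

    def __call__(self, source):
        # Each call on a new input appends one more node and returns the
        # index of that node, which is what the summary table would show.
        self.inbound_nodes.append("called on " + source)
        return len(self.inbound_nodes) - 1


inner = ToyModel("model_1")
call_indices = [inner("input"), inner("model_1 output"), inner("model_1 output")]

print(call_indices)               # [1, 2, 3]
print(len(inner.inbound_nodes))   # 4
```

Under that assumption, index 0 is always taken before the first call, so n calls land on indices 1 to n.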

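Answer 1 (score: 1)

Updated in September 2020. The selected answer is somewhat outdated (the link no longer points to the right place) and does not fully answer the question: in model_1[1][0], why is the node index 1 in this case? Here is what I found.

The code I used is below (I added names to the layers to make the output easier to read). You can copy and run it to see the output.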

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

input_inner = layers.Input(shape=(4,), name='inn_input')
output_inner = layers.Dense(3, name='inn_dense')(input_inner)
inner_model = keras.Model(inputs=input_inner, outputs=output_inner,name='inn_model')

inn_allLayers = inner_model.layers
# print(type(inn_allLayers))
print(inner_model.name,': total layer number:',len(inn_allLayers))
for i in inn_allLayers:
    print(i.name, i)
    print(len(i._inbound_nodes))
    for n in i._inbound_nodes:
        print(n.get_config())
        print(n)
    print('===================')
print('************************************************')

nest_input = layers.Input(shape=(5,), name='nest_input')
nest_d1_out = layers.Dense(4, name='nest_dense_1')(nest_input)
nest_m_out = inner_model(nest_d1_out)
nest_d2_out = layers.Dense(2, name='nest_dense_2')(nest_m_out)

nest_add_out = layers.concatenate([nest_d2_out, nest_d2_out], name='nest_concat')
model = keras.Model(inputs=nest_input, outputs=nest_add_out,name='nest_model')

inn_allLayers = inner_model.layers
# print(type(inn_allLayers))
print(inner_model.name,': total layer number:',len(inn_allLayers))
for i in inn_allLayers:
    print(i.name, i)
    print(len(i._inbound_nodes))
    for n in i._inbound_nodes:
        print(n.get_config())
        print(n)
    print('===================')
print('************************************************')

allLayers = model.layers
# print(type(allLayers))
print(model.name,': total layer number:',len(allLayers))
for i in allLayers:
    print(i.name, i)
    print(len(i._inbound_nodes))
    for n in i._inbound_nodes:
        print(n.get_config())
        print(n)
    print('===================')

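# tf.get_default_graph() is not available at the top level in TF 2.x;
# tf.compat.v1 exposes it on both TF 1.x (1.13+) and TF 2.x.
for op in tf.compat.v1.get_default_graph().get_operations():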
    print(str(op.name))

1. [1][0] stands for [node_index][tensor_index].

2. What is node_index?

It is described in this class in tensorflow/python/keras/engine/base_layer.py:

class KerasHistory(
    collections.namedtuple('KerasHistory',
                           ['layer', 'node_index', 'tensor_index'])):
  """Tracks the Layer call that created a Tensor, for Keras Graph Networks.

  During construction of Keras Graph Networks, this metadata is added to
  each Tensor produced as the output of a Layer, starting with an
  `InputLayer`. This allows Keras to track how each Tensor was produced, and
  this information is later retraced by the `keras.engine.Network` class to
  reconstruct the Keras Graph Network.

  Attributes:
    layer: The Layer that produced the Tensor.
    node_index: The specific call to the Layer that produced this Tensor. Layers
      can be called multiple times in order to share weights. A new node is
      created every time a Tensor is called.
    tensor_index: The output index for this Tensor. Always zero if the Layer
      that produced this Tensor only has one output. Nested structures of
      Tensors are deterministically assigned an index via `nest.flatten`.
  """
  # Added to maintain memory and performance characteristics of `namedtuple`
  # while subclassing.

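It says that a new node is created every time a Tensor is called. To me this is a bit vague. My understanding is that when a layer is called it produces a Tensor, and calling the same layer in different places creates multiple nodes (some printed results are shown later).

3. How to print each node?

In the same .py file there is the following snippet: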

    # Create node, add it to inbound nodes.
    Node(
        self,
        inbound_layers=inbound_layers,
        node_indices=node_indices,
        tensor_indices=tensor_indices,
        input_tensors=input_tensors,
        output_tensors=output_tensors,
        arguments=arguments)

    # Update tensor history metadata.
    # The metadata attribute consists of
    # 1) a layer instance
    # 2) a node index for the layer
    # 3) a tensor index for the node.
    # The allows layer reuse (multiple nodes per layer) and multi-output
    # or multi-input layers (e.g. a layer can return multiple tensors,
    # and each can be sent to a different layer).
    for i, tensor in enumerate(nest.flatten(output_tensors)):
      tensor._keras_history = KerasHistory(self,
                                           len(self._inbound_nodes) - 1, i)

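self refers to the Layer object. The information is recorded in the _keras_history attribute of each tensor and in self._inbound_nodes. So we can print a specific node with print(layer._inbound_nodes[index_of_node].get_config()), as I did in the runnable code at the beginning.

(What are inbound and outbound nodes? It is confusing at first glance, but it may be easier if you picture each node as an arrow pointing from one layer to another. The source describes it as follows.)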

class Node(object):
  """A `Node` describes the connectivity between two layers.

  Each time a layer is connected to some new input,
  a node is added to `layer._inbound_nodes`.
  Each time the output of a layer is used by another layer,
  a node is added to `layer._outbound_nodes`.

  Arguments:
      outbound_layer: the layer that takes
          `input_tensors` and turns them into `output_tensors`
          (the node gets created when the `call`
          method of the layer was called).
      inbound_layers: a list of layers, the same length as `input_tensors`,
          the layers from where `input_tensors` originate.
      node_indices: a list of integers, the same length as `inbound_layers`.
          `node_indices[i]` is the origin node of `input_tensors[i]`
          (necessary since each inbound layer might have several nodes,
          e.g. if the layer is being shared with a different data stream).
      tensor_indices: a list of integers,
          the same length as `inbound_layers`.
          `tensor_indices[i]` is the index of `input_tensors[i]` within the
          output of the inbound layer
          (necessary since each inbound layer might
          have multiple tensor outputs, with each one being
          independently manipulable).
      input_tensors: list of input tensors.
      output_tensors: list of output tensors.
      arguments: dictionary of keyword arguments that were passed to the
          `call` method of the layer at the call that created the node.

  `node_indices` and `tensor_indices` are basically fine-grained coordinates
  describing the origin of the `input_tensors`.

  A node from layer A to layer B is added to:
    - A._outbound_nodes
    - B._inbound_nodes
  """

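4. Observing node creation

You may notice that the code contains two identical print blocks for inner_model: one before the nested model is built and one after.

The output is as follows: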

inn_model : total layer number: 2
inn_input <tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7fd1c6755780>
1
{'outbound_layer': 'inn_input', 'inbound_layers': [], 'node_indices': [], 'tensor_indices': []}
<tensorflow.python.keras.engine.base_layer.Node object at 0x7fd1d2e75e10>
===================
inn_dense <tensorflow.python.keras.layers.core.Dense object at 0x7fd1d2e75e80>
1
{'outbound_layer': 'inn_dense', 'inbound_layers': 'inn_input', 'node_indices': 0, 'tensor_indices': 0}
<tensorflow.python.keras.engine.base_layer.Node object at 0x7fd1d2e92550>
===================
************************************************
inn_model : total layer number: 2
inn_input <tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7fd1c6755780>
1
{'outbound_layer': 'inn_input', 'inbound_layers': [], 'node_indices': [], 'tensor_indices': []}
<tensorflow.python.keras.engine.base_layer.Node object at 0x7fd1d2e75e10>
===================
inn_dense <tensorflow.python.keras.layers.core.Dense object at 0x7fd1d2e75e80>
2
{'outbound_layer': 'inn_dense', 'inbound_layers': 'inn_input', 'node_indices': 0, 'tensor_indices': 0}
<tensorflow.python.keras.engine.base_layer.Node object at 0x7fd1d2e92550>
{'outbound_layer': 'inn_dense', 'inbound_layers': 'nest_dense_1', 'node_indices': 0, 'tensor_indices': 0}
<tensorflow.python.keras.engine.base_layer.Node object at 0x7fd1d2ac4358>
===================
************************************************

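You will notice immediately that after the nested model is built, an extra (inbound) node (or arrow) pointing to inn_dense has been created. One node points from inn_input to inn_dense, and the other points from nest_dense_1 to inn_dense. This is what was said earlier: every time a layer is called, a new node (arrow) is created.

5. The question answered

I think this explains the original question: why the node index is 1 in [1][0]. It is because reusing inner_model causes the inner_dense layer (inn_dense in my code) to be used a second time to create a tensor.

The rest of the code snippet prints some additional information, which you can inspect to get a better picture.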

Answer 2 (score: 0)

It seems that the attribute is now "_nodes_by_depth" rather than "nodes_by_depth". The same goes for inbound_nodes and so on. The answers may need to be updated.
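If a script has to run against both naming schemes, one option is a small accessor that tries the underscore-prefixed name first. This is a minimal sketch: `get_attr_compat` is a hypothetical helper, and the two dummy classes below only stand in for layers from an older and a newer Keras version.

```python
def get_attr_compat(obj, name):
    """Return obj._<name> if it exists, otherwise obj.<name> (hypothetical helper)."""
    private_name = "_" + name
    if hasattr(obj, private_name):
        return getattr(obj, private_name)
    return getattr(obj, name)


class OldStyleLayer:
    # Stand-in for a layer that exposes the public attribute name.
    inbound_nodes = ["node_a"]


class NewStyleLayer:
    # Stand-in for a layer that exposes the underscore-prefixed name.
    _inbound_nodes = ["node_b"]


print(get_attr_compat(OldStyleLayer(), "inbound_nodes"))  # ['node_a']
print(get_attr_compat(NewStyleLayer(), "inbound_nodes"))  # ['node_b']
```

The same call would work for nodes_by_depth, for example get_attr_compat(model, "nodes_by_depth"), assuming the model exposes one of the two names.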