I have built an RNN using BasicRNN, and now I want to use an LSTMCell instead, but the transition does not seem trivial. What should I change?
First I define all the placeholders and variables:
X_placeholder = tf.placeholder(tf.float32, [batch_size, truncated_backprop_length, embedding_size])
Y_placeholder = tf.placeholder(tf.int32, [batch_size, truncated_backprop_length])
init_state = tf.placeholder(tf.float32, [batch_size, state_size])
W = tf.Variable(np.random.rand(state_size, num_classes),dtype=tf.float32)
b = tf.Variable(np.zeros((batch_size, num_classes)), dtype=tf.float32)
W2 = tf.Variable(np.random.rand(state_size, num_classes),dtype=tf.float32)
b2 = tf.Variable(np.zeros((batch_size, num_classes)), dtype=tf.float32)
Then I unpack the labels:
labels_series = tf.transpose(batchY_placeholder)
labels_series = tf.unstack(batchY_placeholder, axis=1)
inputs_series = X_placeholder
Then I define my RNN:
cell = tf.contrib.rnn.BasicLSTMCell(state_size, state_is_tuple = False)
states_series, current_state = tf.nn.dynamic_rnn(cell, inputs_series, initial_state = init_state)
The error I get is:
InvalidArgumentError Traceback (most recent call last)
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, debug_python_shape_fn, require_shape_fn)
669 node_def_str, input_shapes, input_tensors, input_tensors_as_shapes,
--> 670 status)
671 except errors.InvalidArgumentError as err:
/home/deepnlp2017/anaconda3/lib/python3.5/contextlib.py in __exit__(self, type, value, traceback)
65 try:
---> 66 next(self.gen)
67 except StopIteration:
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in raise_exception_on_not_ok_status()
468 compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 469 pywrap_tensorflow.TF_GetCode(status))
470 finally:
InvalidArgumentError: Dimensions must be equal, but are 50 and 100 for 'rnn/while/basic_lstm_cell/mul' (op: 'Mul') with input shapes: [32,50], [32,100].
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-19-2ac617f4dde4> in <module>()
4 #cell = tf.contrib.rnn.BasicRNNCell(state_size)
5 cell = tf.contrib.rnn.BasicLSTMCell(state_size, state_is_tuple = False)
----> 6 states_series, current_state = tf.nn.dynamic_rnn(cell, inputs_series, initial_state = init_state)
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py in dynamic_rnn(cell, inputs, sequence_length, initial_state, dtype, parallel_iterations, swap_memory, time_major, scope)
543 swap_memory=swap_memory,
544 sequence_length=sequence_length,
--> 545 dtype=dtype)
546
547 # Outputs of _dynamic_rnn_loop are always shaped [time, batch, depth].
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py in _dynamic_rnn_loop(cell, inputs, initial_state, parallel_iterations, swap_memory, sequence_length, dtype)
710 loop_vars=(time, output_ta, state),
711 parallel_iterations=parallel_iterations,
--> 712 swap_memory=swap_memory)
713
714 # Unpack final output if not using output tuples.
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py in while_loop(cond, body, loop_vars, shape_invariants, parallel_iterations, back_prop, swap_memory, name)
2624 context = WhileContext(parallel_iterations, back_prop, swap_memory, name)
2625 ops.add_to_collection(ops.GraphKeys.WHILE_CONTEXT, context)
-> 2626 result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
2627 return result
2628
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py in BuildLoop(self, pred, body, loop_vars, shape_invariants)
2457 self.Enter()
2458 original_body_result, exit_vars = self._BuildLoop(
-> 2459 pred, body, original_loop_vars, loop_vars, shape_invariants)
2460 finally:
2461 self.Exit()
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py in _BuildLoop(self, pred, body, original_loop_vars, loop_vars, shape_invariants)
2407 structure=original_loop_vars,
2408 flat_sequence=vars_for_body_with_tensor_arrays)
-> 2409 body_result = body(*packed_vars_for_body)
2410 if not nest.is_sequence(body_result):
2411 body_result = [body_result]
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py in _time_step(time, output_ta_t, state)
695 skip_conditionals=True)
696 else:
--> 697 (output, new_state) = call_cell()
698
699 # Pack state if using state tuples
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py in <lambda>()
681
682 input_t = nest.pack_sequence_as(structure=inputs, flat_sequence=input_t)
--> 683 call_cell = lambda: cell(input_t, state)
684
685 if sequence_length is not None:
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py in __call__(self, inputs, state, scope)
182 i, j, f, o = array_ops.split(value=concat, num_or_size_splits=4, axis=1)
183
--> 184 new_c = (c * sigmoid(f + self._forget_bias) + sigmoid(i) *
185 self._activation(j))
186 new_h = self._activation(new_c) * sigmoid(o)
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py in binary_op_wrapper(x, y)
882 if not isinstance(y, sparse_tensor.SparseTensor):
883 y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
--> 884 return func(x, y, name=name)
885
886 def binary_op_wrapper_sparse(sp_x, y):
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py in _mul_dispatch(x, y, name)
1103 is_tensor_y = isinstance(y, ops.Tensor)
1104 if is_tensor_y:
-> 1105 return gen_math_ops._mul(x, y, name=name)
1106 else:
1107 assert isinstance(y, sparse_tensor.SparseTensor) # Case: Dense * Sparse.
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py in _mul(x, y, name)
1623 A `Tensor`. Has the same type as `x`.
1624 """
-> 1625 result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
1626 return result
1627
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py in apply_op(self, op_type_name, name, **keywords)
761 op = g.create_op(op_type_name, inputs, output_types, name=scope,
762 input_types=input_types, attrs=attr_protos,
--> 763 op_def=op_def)
764 if output_structure:
765 outputs = op.outputs
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in create_op(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_shapes, compute_device)
2395 original_op=self._default_original_op, op_def=op_def)
2396 if compute_shapes:
-> 2397 set_shapes_for_outputs(ret)
2398 self._add_op(ret)
2399 self._record_op_seen_by_control_dependencies(ret)
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in set_shapes_for_outputs(op)
1755 shape_func = _call_cpp_shape_fn_and_require_op
1756
-> 1757 shapes = shape_func(op)
1758 if shapes is None:
1759 raise RuntimeError(
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in call_with_requiring(op)
1705
1706 def call_with_requiring(op):
-> 1707 return call_cpp_shape_fn(op, require_shape_fn=True)
1708
1709 _call_cpp_shape_fn_and_require_op = call_with_requiring
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py in call_cpp_shape_fn(op, input_tensors_needed, input_tensors_as_shapes_needed, debug_python_shape_fn, require_shape_fn)
608 res = _call_cpp_shape_fn_impl(op, input_tensors_needed,
609 input_tensors_as_shapes_needed,
--> 610 debug_python_shape_fn, require_shape_fn)
611 if not isinstance(res, dict):
612 # Handles the case where _call_cpp_shape_fn_impl calls unknown_shape(op).
/home/deepnlp2017/.local/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, debug_python_shape_fn, require_shape_fn)
673 missing_shape_fn = True
674 else:
--> 675 raise ValueError(err.message)
676
677 if missing_shape_fn:
ValueError: Dimensions must be equal, but are 50 and 100 for 'rnn/while/basic_lstm_cell/mul' (op: 'Mul') with input shapes: [32,50], [32,100].
Answer 0 (score: 4)
You should consider providing the error trace, otherwise it is hard (or impossible) to help.
I reproduced the situation and found that the problem was coming from the state unpacking, i.e. the line c, h = state.
Try setting state_is_tuple to False, i.e.
cell = tf.contrib.rnn.BasicLSTMCell(state_size, state_is_tuple=False)
I'm not sure why this happens. Are you loading a previous model? What is your TensorFlow version?
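As a side note, the shape mismatch in the trace ([32,50] vs. [32,100]) fits this picture: with state_is_tuple=False the cell operates on the cell state c and the hidden state h concatenated along axis 1, so the initial state must be twice state_size wide. A minimal sketch of a matching setup, reusing batch_size, state_size and X_placeholder from the question (the widened placeholder is an inference from the trace, not code from the question):
# With state_is_tuple=False, the initial state carries c and h concatenated,
# hence width 2 * state_size (here [32, 100] rather than [32, 50]).
init_state = tf.placeholder(tf.float32, [batch_size, 2 * state_size])
cell = tf.contrib.rnn.BasicLSTMCell(state_size, state_is_tuple=False)
states_series, current_state = tf.nn.dynamic_rnn(cell, X_placeholder, initial_state=init_state)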
More on TensorFlow's RNN cells:
I suggest you take a look at the WildML post, especially the section "RNN CELLS, WRAPPERS AND MULTI-LAYER RNNS".
It states:
- BasicRNNCell – A vanilla RNN cell.
- GRUCell – A Gated Recurrent Unit cell.
- BasicLSTMCell – An LSTM cell based on Recurrent Neural Network Regularization. No peephole connections or cell clipping.
- LSTMCell – A more complex LSTM cell that allows for optional peephole connections and cell clipping.
- MultiRNNCell – A wrapper to combine multiple cells into a multi-layer cell.
- DropoutWrapper – A wrapper that adds dropout to the input and/or output connections of a cell.
Given this, I would suggest you switch from BasicRNNCell to BasicLSTMCell. Basic here means "use it unless you know what you are doing". If you want to try LSTMs without going into the details, that is the way to go. It may be as simple as swapping the cell in and leaving everything else alone!
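A hedged sketch of that drop-in swap, reusing the names from the question (with the default state_is_tuple=True the initial state is an LSTMStateTuple, so zero_state stands in here for the hand-made init_state placeholder):
# cell = tf.contrib.rnn.BasicRNNCell(state_size)     # before
cell = tf.contrib.rnn.BasicLSTMCell(state_size)      # after; state_is_tuple defaults to True
init_state = cell.zero_state(batch_size, tf.float32) # LSTMStateTuple of (c, h), each [batch_size, state_size]
states_series, current_state = tf.nn.dynamic_rnn(cell, X_placeholder, initial_state=init_state)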
If not, share some code + error.
Hope it helps.
Answer 1 (score: 0)
The problem seems to be with the init_state variable.
A basic RNN cell has only one state variable, while an LSTM has both a cell state and a hidden state. Specifying the option state_is_tuple=False concatenates the two state variables into one, so the size given in the init_state declaration has to be doubled.
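Concretely, with the declarations from the question that would mean (a one-line sketch; judging by the 50 vs. 100 mismatch in the trace, state_size is 50 here):
# doubled width: cell state c and hidden state h share one concatenated tensor
init_state = tf.placeholder(tf.float32, [batch_size, 2 * state_size])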
To avoid this, you can use the built-in zero_state method to initialize the state for the LSTMCell in the right way, without having to worry about the size difference. So it is simply:
init_state = cell.zero_state(batch_size, dtype)
Naturally, this has to be placed after the line where the cell is built.
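Put together, a minimal sketch with the question's variable names (zero_state picks the right width for whichever state_is_tuple setting the cell was built with):
cell = tf.contrib.rnn.BasicLSTMCell(state_size, state_is_tuple=False)
init_state = cell.zero_state(batch_size, tf.float32)  # shaped [batch_size, 2 * state_size] automatically
states_series, current_state = tf.nn.dynamic_rnn(cell, X_placeholder, initial_state=init_state)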