我是Tensorflow的新手,想了解keras LSTM layer,所以我写了这篇
测试程序以识别stateful
选项的行为。
#Tensorflow 1.x version
import tensorflow as tf
import numpy as np
NUM_UNITS=1
NUM_TIME_STEPS=5
NUM_FEATURES=1
BATCH_SIZE=4
STATEFUL=True
STATEFUL_BETWEEN_BATCHES=True
lstm = tf.keras.layers.LSTM(units=NUM_UNITS, stateful=STATEFUL,
return_state=True, return_sequences=True,
batch_input_shape=(BATCH_SIZE, NUM_TIME_STEPS, NUM_FEATURES),
kernel_initializer='ones', bias_initializer='ones',
recurrent_initializer='ones')
x = tf.keras.Input((NUM_TIME_STEPS,NUM_FEATURES),batch_size=BATCH_SIZE)
result = lstm(x)
I = tf.compat.v1.global_variables_initializer()
sess = tf.compat.v1.Session()
sess.run(I)
X_input = np.array([[[3.14*(0.01)] for t in range(NUM_TIME_STEPS)] for b in range(BATCH_SIZE)])
feed_dict={x: X_input}
def matprint(run, mat):
print('Batch = ', run)
for b in range(mat.shape[0]):
print('Batch Sample:', b, ', per-timestep output')
print(mat[b].squeeze())
print('BATCH_SIZE = ', BATCH_SIZE, ', T = ', NUM_TIME_STEPS, ', stateful =', STATEFUL)
if STATEFUL:
print('STATEFUL_BETWEEN_BATCHES = ', STATEFUL_BETWEEN_BATCHES)
for r in range(2):
feed_dict={x: X_input}
OUTPUT_NEXTSTATES = sess.run({'result': result}, feed_dict=feed_dict)
OUTPUT = OUTPUT_NEXTSTATES['result'][0]
NEXT_STATES=OUTPUT_NEXTSTATES['result'][1:]
matprint(r,OUTPUT)
if STATEFUL:
if STATEFUL_BETWEEN_BATCHES:
#For TF version 1.x manually re-assigning states from
#the last batch IS required for some reason ...
#seems like a bug
sess.run(lstm.states[0].assign(NEXT_STATES[0]))
sess.run(lstm.states[1].assign(NEXT_STATES[1]))
else:
lstm.reset_states()
请注意,LSTM的权重设置为所有权重,并且输入保持恒定以保持一致性。
如预期的那样,statueful=False
没有样本,时间或批间时脚本的输出
依赖:
BATCH_SIZE = 4 , T = 5 , stateful = False
Batch = 0
Batch Sample: 0 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 1 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 2 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 3 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch = 1
Batch Sample: 0 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 1 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 2 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 3 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
在设置stateful=True
时,我期望每批中的样品以产生不同的输出(
大概是因为TF图保持了批次样品之间的状态)。但是,情况并非如此:
BATCH_SIZE = 4 , T = 5 , stateful = True
STATEFUL_BETWEEN_BATCHES = True
Batch = 0
Batch Sample: 0 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 1 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 2 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 3 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch = 1
Batch Sample: 0 , per-timestep output
[0.86686385 0.8686781 0.8693927 0.8697042 0.869853 ]
Batch Sample: 1 , per-timestep output
[0.86686385 0.8686781 0.8693927 0.8697042 0.869853 ]
Batch Sample: 2 , per-timestep output
[0.86686385 0.8686781 0.8693927 0.8697042 0.869853 ]
Batch Sample: 3 , per-timestep output
[0.86686385 0.8686781 0.8693927 0.8697042 0.869853 ]
尤其要注意,同一批次的前两个样本的输出是相同的。
编辑:OverlordGoldDragon已通知我
这种行为是预期的,而我的困惑在于 Batch 之间的区别-
(samples, timesteps, features)
-和一个批次(或该批次的一个“行”)中的 Sample 。
由下图表示:
因此,这提出了给定批次中各个样本之间的依赖性(如果有)的问题。来自 我的脚本输出,导致我相信每个样本被馈送到(逻辑上)独立的LSTM块-并且 差异样本的LSTM状态是独立的。我在这里画的:
我的理解正确吗?
顺便说一句,似乎stateful=True
在TensorFlow 1.x中是损坏的,因为如果我删除了显式的
上一个批次的状态分配:
sess.run(lstm.states[0].assign(NEXT_STATES[0]))
sess.run(lstm.states[1].assign(NEXT_STATES[1]))
它将停止工作,即第二批的输出与第一批的输出相同。
我用Tensorflow 2.0语法重新编写了上面的脚本,其行为正是我所期望的 (无需在批次之间手动保留LSTM状态):
#Tensorflow 2.0 implementation
import tensorflow as tf
import numpy as np
NUM_UNITS=1
NUM_TIME_STEPS=5
NUM_FEATURES=1
BATCH_SIZE=4
STATEFUL=True
STATEFUL_BETWEEN_BATCHES=True
lstm = tf.keras.layers.LSTM(units=NUM_UNITS, stateful=STATEFUL,
return_state=True, return_sequences=True,
batch_input_shape=(BATCH_SIZE, NUM_TIME_STEPS, NUM_FEATURES),
kernel_initializer='ones', bias_initializer='ones',
recurrent_initializer='ones')
X_input = np.array([[[3.14*(0.01)]
for t in range(NUM_TIME_STEPS)]
for b in range(BATCH_SIZE)])
@tf.function
def forward(x):
return lstm(x)
def matprint(run, mat):
print('Batch = ', run)
for b in range(mat.shape[0]):
print('Batch Sample:', b, ', per-timestep output')
print(mat[b].squeeze())
print('BATCH_SIZE = ', BATCH_SIZE, ', T = ', NUM_TIME_STEPS, ', stateful =', STATEFUL)
if STATEFUL:
print('STATEFUL_BETWEEN_BATCHES = ', STATEFUL_BETWEEN_BATCHES)
for r in range(2):
OUTPUT_NEXTSTATES = forward(X_input)
OUTPUT = OUTPUT_NEXTSTATES[0].numpy()
NEXT_STATES=OUTPUT_NEXTSTATES[1:]
matprint(r,OUTPUT)
if STATEFUL:
if STATEFUL_BETWEEN_BATCHES:
pass
#Explicitly re-assigning states from the last batch isn't
# required as the model maintains inter-batch history.
#This is NOT the same behavior for TF.version < 2.0
#lstm.states[0].assign(NEXT_STATES[0].numpy())
#lstm.states[1].assign(NEXT_STATES[1].numpy())
else:
lstm.reset_states()
这是输出:
BATCH_SIZE = 4 , T = 5 , stateful = True
STATEFUL_BETWEEN_BATCHES = True
Batch = 0
Batch Sample: 0 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 1 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 2 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch Sample: 3 , per-timestep output
[0.38041887 0.663519 0.79821336 0.84627265 0.8617684 ]
Batch = 1
Batch Sample: 0 , per-timestep output
[0.86686385 0.8686781 0.8693927 0.8697042 0.869853 ]
Batch Sample: 1 , per-timestep output
[0.86686385 0.8686781 0.8693927 0.8697042 0.869853 ]
Batch Sample: 2 , per-timestep output
[0.86686385 0.8686781 0.8693927 0.8697042 0.869853 ]
Batch Sample: 3 , per-timestep output
[0.86686385 0.8686781 0.8693927 0.8697042 0.869853 ]
答案 0 :(得分:2)
一切似乎都按预期进行-但需要对代码进行大量修订:
public string ReadFromUserTillTrue(string promptMessage,string errorMessage,Func<string,bool> validator)
{
var input = string.Empty;
while(true)
{
Console.WriteLine(promptMessage);
input = Console.ReadLine();
if(validator(input))
break;
else
Console.WriteLine(errorMessage);
}
return input;
}
应该是Batch: 0
;您的Sample: 0
包含4个样本,5个时间步长和1个功能 / 通道。在您的情况下,batch_shape=(4, 5, 1)
是实际的批次标记I
进行验证print(X_input)
产生相同输出(因为没有保持内部状态)-而stateful=False
即使输入相同(由于内存),也会产生不同stateful=True
的不同输出I
不是学习的,所以权重是相同的-对于相同的输入,所有lstm
的输出都将完全相同