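I want to use the output of an embedding layer as the initial state of an LSTM layer. Of course, this only applies when batch_size = 1, because I know the LSTM state has shape [batch_size, num_units], where num_units is the number of units in the LSTM. In addition, I am using embedding_size = 1.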
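To make the state shapes concrete, here is a rough NumPy sketch of a single LSTM step that takes an explicit initial state. It is not the Keras implementation; `lstm_step`, `W`, `U` and `b` are hypothetical names used only for illustration, and the gate order (input, forget, candidate, output) is an assumption of this sketch. The point is only that both the hidden state and the cell state have shape [batch_size, num_units].

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step; stacked gates are split as input, forget, candidate, output."""
    z = x_t @ W + h_prev @ U + b              # shape [batch_size, 4 * num_units]
    i, f, g, o = np.split(z, 4, axis=1)       # each has shape [batch_size, num_units]
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h_t = sigmoid(o) * np.tanh(c_t)
    return h_t, c_t

batch_size, n_feats, num_units = 1, 3, 64
rng = np.random.default_rng(0)
W = rng.normal(size=(n_feats, 4 * num_units))      # input weights (illustrative)
U = rng.normal(size=(num_units, 4 * num_units))    # recurrent weights (illustrative)
b = np.zeros(4 * num_units)

# The initial state is a pair (h0, c0), each of shape [batch_size, num_units].
h0 = np.full((batch_size, num_units), 0.5)
c0 = np.full((batch_size, num_units), 0.5)

x_t = rng.normal(size=(batch_size, n_feats))       # one time step of 3 features
h1, c1 = lstm_step(x_t, h0, c0, W, U, b)
print(h1.shape, c1.shape)
```

This is why the same [1, num_units] tensor can be passed for both entries of the initial state.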
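For this minimal example, I have 1000 observations, 20 time steps and 3 numerical features. Each observation (row) is identified by a customer_id, and in this case I have 3 customers. So when I train on a single example, I use the output of the embedding layer (which has shape [1, 1, 1]). However, I first reshape it and then tile it so that its size in the second dimension is num_units.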
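The reshape-and-tile step described above can be sketched on a concrete array. This is a NumPy stand-in, not the TensorFlow code itself: `embedding_table` and its values are made up for illustration, and the repetition counts (1, num_units) are my assumption about the intended behavior, namely keep the batch dimension as is and repeat the single embedding value num_units times along the second dimension.

```python
import numpy as np

num_units = 64

# Hypothetical embedding table: one scalar per customer (embedding_size = 1).
embedding_table = np.array([[0.1], [0.2], [0.3]])   # shape [3, 1]

customer_id = np.array([[2]])                       # one example, shape [1, 1]
embedded = embedding_table[customer_id]             # lookup, shape [1, 1, 1]
squeezed = np.squeeze(embedded, axis=1)             # drop the middle axis, shape [1, 1]
state = np.tile(squeezed, (1, num_units))           # repeat along axis 1, shape [1, 64]

print(embedded.shape, squeezed.shape, state.shape)
```

Each of the 64 entries of `state` holds the same embedding value for that customer.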
Here is a minimal example:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
print(tf.__version__)
# 2.0.0
num_obs = 1000
n_steps = 20
n_numerical_feats = 3
cat_size = 15
embedding_size = 1
num_units = 64
target = np.random.random(size=(num_obs,1))
print(target.shape)
#(1000, 1)
# 3 numerical variables
num_data = np.random.random(size=(num_obs*n_steps,n_numerical_feats))
print(num_data.shape)
#(20000, 3)
#Reshaping numeric features to fit into an LSTM network
X_numeric = num_data.reshape(-1,n_steps,n_numerical_feats)
print(X_numeric.shape)
#(1000, 20, 3)
unique_customer_ids = 3
customer_id = np.random.randint(0, unique_customer_ids, num_obs).reshape(-1,1)
print(customer_id.shape)
#(1000, 1)
customer_id_input = keras.layers.Input(shape=(1,), name='customer_id_input')
#<tf.Tensor 'customer_id_input:0' shape=(None, 1) dtype=float32>
customer_id_embedded = keras.layers.Embedding(input_dim=unique_customer_ids, output_dim = 1, embeddings_initializer='uniform')(customer_id_input)
#<tf.Tensor 'embedding/Identity:0' shape=(None, 1, 1) dtype=float32>
customer_id_embedded_reshape = tf.squeeze(customer_id_embedded, [1])
# <tf.Tensor 'Squeeze:0' shape=(None, 1) dtype=float32>
state = tf.tile(customer_id_embedded_reshape, [-1, num_units])
#<tf.Tensor 'Tile:0' shape=(None, 64) dtype=float32>
numerical_inputs = keras.layers.Input(shape=(n_steps, n_numerical_feats), name='numerical_inputs')
#<tf.Tensor 'numerical_inputs:0' shape=(None, 20, 3) dtype=float32>
lstm_out, state_h, state_c = keras.layers.LSTM(
    units=num_units,
    return_sequences=False,
    return_state=True,
    kernel_initializer='glorot_uniform',
    recurrent_initializer='glorot_uniform',
    bias_initializer='zeros',
)(numerical_inputs, initial_state=[state, state])
#<tf.Tensor 'lstm/strided_slice_7:0' shape=(?, 64) dtype=float32>
Dense_layer1 = keras.layers.Dense(32, activation='relu', use_bias=True)(lstm_out)
Dense_layer2 = keras.layers.Dense(1, activation='linear', use_bias=True)(Dense_layer1)
model = keras.models.Model(inputs=[numerical_inputs] + [customer_id_input], outputs=Dense_layer2)
model.summary()
keras.utils.plot_model(model, to_file='model.png', show_shapes = True, show_layer_names = True)
#compile model
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss='mse',
              optimizer=optimizer,
              metrics=['mae', 'mse'])
EPOCHS = 1000
#fit the model
#you can use input layer names instead
history = model.fit({'numerical_inputs': X_numeric,
                     'customer_id_input': customer_id},
                    y=target,
                    batch_size=1,
                    epochs=EPOCHS,
                    verbose=1,
                    initial_epoch=0)
However, I get this error and cannot find anything about it. Everything looks fine to me, and I just can't figure out where I went wrong.