Question

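I am new to neural network development and have been trying to build a basic LSTM classifier, but I ran into this error: "logits and labels must have the same first dimension, got logits shape [125,3] and labels shape [25]".

I used the same data transformation function and the same data shapes with an almost identical network architecture for a regression LSTM, and did not hit this problem. The main architectural difference is that the regression model ends in a Dense layer with a single output unit.

Here is my code, from data preprocessing and transformation through to the model architecture and fitting.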

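import numpy as np
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

#`future` is a pandas DataFrame that was loaded earlier (not shown here)
#scale the input variables
sc = MinMaxScaler(feature_range=(0,1))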
#get the data examples
inputs = sc.fit_transform(future[['close', 'bullprice', 'bearprice', 'RSI', 'bullrsi', 
                                  'bearrsi', 'bulldiv', 'beardiv', 'com_hedgers', 
                                  'bullhedge', 'bearhedge']])
labels = future[['indicator']].values
data = np.concatenate((inputs, labels), axis=1)
data = data[14:] #get rid of first 14 rows because of RSI 

#function to convert examples into LSTM input/X (needs to be 3D matrix) and labels (output/Y)
def create_dataset(dataset, look_back): #changed lookback here
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - 1):
        a = dataset[i:(i + look_back), :-1]
        dataX.append(a)
        dataY.append(dataset[i + look_back, -1])
    return np.array(dataX), np.array(dataY)
look_back = 5 #number of previous days used to predict the next day
num_features = 11 #place the number of features that are going in to the model

#apply function to the data
X, Y = create_dataset(data, look_back) 

#split into train and test set data
trainX, testX = np.split(X, [int(.7*len(X))])
trainY, testY = np.split(Y, [int(.7*len(Y))])
trainY = trainY+1
testY = testY+1

# one-hot encode the outputs (sell, hold, buy) DOES NOT SEEM TO WORK FOR THE MODEL SO I COMMENTED IT OUT
#onehot_encoder = OneHotEncoder(categories='auto', sparse=False) #'categories=auto' allows the negative categories to work
#trainY = onehot_encoder.fit_transform(trainY.reshape([-1, 1])) #one-hot encoded training set
#testY = onehot_encoder.fit_transform(testY.reshape([-1, 1])) #one-hot encoded test set

'''
Building and fitting the deep learning model
'''
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(trainX.shape[0], input_shape=(look_back,num_features), return_sequences=True), #number of units is set to trainX.shape[0]; passes the full sequence to the next layer
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(3,  activation=tf.nn.softmax) #softmax gives us a probability distribution
])
model.compile(optimizer='RMSProp', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
history = model.fit(trainX, trainY, validation_split=0.2, epochs=3, batch_size=25) 
#the first dimension of logits shape is a function of the size of the batch_size and look_back
#something is wrong with the shaping of the data
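
For reference, here is a small NumPy check of the shapes that create_dataset produces. It is only a sketch on made-up data: the 100-row random array and the label values -1, 0, 1 are assumptions for illustration, not my real dataset.

```python
import numpy as np

def create_dataset(dataset, look_back):
    """Same windowing as in the listing above: features come from
    look_back consecutive rows, the label from the row that follows."""
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - 1):
        dataX.append(dataset[i:i + look_back, :-1])
        dataY.append(dataset[i + look_back, -1])
    return np.array(dataX), np.array(dataY)

# Hypothetical stand-in for the real data: 100 rows, 11 features + 1 label column.
rng = np.random.default_rng(0)
features = rng.random((100, 11))
labels = rng.integers(-1, 2, size=(100, 1))  # assumed label values: -1, 0, 1
data = np.concatenate((features, labels), axis=1)

X, Y = create_dataset(data, look_back=5)
print(X.shape)  # (94, 5, 11): samples, time steps, features
print(Y.shape)  # (94,): one label per sample
```

So each sample carries look_back time steps of features but only a single label.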

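My problem seems to come from the first dimension of the logits output being the product of the look_back variable (from the first part of the code) and the batch_size used when fitting the model: 5 × 25 = 125, whereas the labels only have 25 entries. It all comes down to matrix dimensions; I just don't know how to work out the fix.

I expect the model to run without errors and then output a probability distribution for predicting the three classes.

Thank you for your time. Please let me know if you need any clarification or further information.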
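
To make the shape arithmetic concrete, here is a minimal NumPy sketch. It does not run the model; it only mimics the shapes, on the understanding that Keras applies a Dense layer at every time step when it receives a 3D input and that the sparse cross-entropy loss flattens the scores to [-1, num_classes]. The array names are made up for illustration.

```python
import numpy as np

batch_size, look_back, num_classes = 25, 5, 3

# With return_sequences=True on the last LSTM, the Dense layer is applied at
# every time step, so there is one row of class scores per step per sample.
per_step_output = np.zeros((batch_size, look_back, num_classes))
labels = np.zeros((batch_size,), dtype=int)

# The loss compares flattened scores against flattened labels.
flat_logits = per_step_output.reshape(-1, num_classes)
flat_labels = labels.reshape(-1)
print(flat_logits.shape, flat_labels.shape)  # (125, 3) (25,)

# Keeping only the last time step gives one row of scores per sample,
# which matches the number of labels.
last_step_output = per_step_output[:, -1, :]
print(last_step_output.shape)  # (25, 3)
```

The last-step shape is what return_sequences=False on the final LSTM layer would lead to. Whether that is the right modelling choice for this data is untested.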
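
As a sketch of the output I am after, here is a plain NumPy softmax over three class scores per sample. The scores are invented, and the mapping of indices 0, 1, 2 to sell, hold, buy assumes the labels were -1, 0, 1 before the +1 shift in my code.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up scores for two samples and three classes.
scores = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
probs = softmax(scores)

print(probs.shape)         # (2, 3): one distribution per sample
print(probs.sum(axis=-1))  # each row sums to 1 (up to rounding)

# Most likely class per sample (ties resolve to the lowest index),
# then shifted back to the assumed -1/0/1 labels.
predicted_index = probs.argmax(axis=-1)
predicted_label = predicted_index - 1
```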

Logits and labels in TensorFlow

0 answers: