What shape should the y column have for an LSTM doing multiclass classification on non-text sequential data?

Time: 2019-08-22 13:39:30

Tags: keras time-series lstm multiclass-classification

Question

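I have a dataset (features = 175, n_time_steps = 954, number of sequences = 737). Columns 1-174 are features, and the last column is the target, which contains 3 distinct classes. I want to use an LSTM for multiclass classification that predicts only the last time step, i.e. uses 953 steps of features to predict the class at step 954. I am struggling with the structure of the y_train input. I would appreciate any ideas on how to reshape y_train correctly for this problem.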

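Data

I have 737 products, each with 954 days of sales. The target classes are 0 (the product does not exist), 1 (product of type A) and 2 (product of type B). I need to use 953 days and 174 features to predict each product's class on the last day of the sequence (day 954). The test set has 100 products and the training set has 637.

After reshaping, X_train has shape (637, 953, 175) and y_train has shape (637, 1). When I run to_categorical on it, the shape becomes (637, 2). Both y_train shapes raise errors when fitting the LSTM model.

When I fit with y_train of shape (637, 1), the error is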

ValueError: You are passing a target array of shape (637, 1) while using as loss `categorical_crossentropy`. `categorical_crossentropy` expects targets to be binary matrices (1s and 0s) of shape (samples, classes). If your targets are integer classes, you can convert them to the expected format via:

from keras.utils import to_categorical
y_binary = to_categorical(y_int)


Alternatively, you can use the loss function `sparse_categorical_crossentropy` instead, which does expect integer targets.

When I fit with to_categorical(y_train), of shape (637, 2), the error is

ValueError: Error when checking target: expected dense_45 to have shape (1,) but got array with shape (2,)
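
A note on the (637, 2) shape: `to_categorical` sizes its output as the largest label plus one unless `num_classes` is passed, so a width of 2 would be expected if only labels 0 and 1 occur in y_train. The sketch below imitates that sizing rule in plain NumPy. The `one_hot` helper and the label values are hypothetical and used only for illustration; this is not the Keras function itself.

```python
import numpy as np

def one_hot(y, num_classes=None):
    """Hypothetical stand-in that sizes its output the way to_categorical does."""
    y = np.asarray(y, dtype=int).ravel()      # (n, 1) -> (n,)
    if num_classes is None:
        num_classes = int(y.max()) + 1        # width inferred from the largest label
    return np.eye(num_classes, dtype="float32")[y]

# Made-up labels in which class 2 never occurs
y_train = np.array([0, 1, 0, 1, 0]).reshape(-1, 1)

inferred = one_hot(y_train)                   # width inferred from the data
explicit = one_hot(y_train, num_classes=3)    # width fixed to the 3 known classes

print(inferred.shape)
print(explicit.shape)
```

With Keras the corresponding call would be `to_categorical(y_train, num_classes=3)`, and the final Dense layer would then also need 3 units to match.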

When I switch to 'sparse_categorical_crossentropy' and fit with y_train of shape (637, 1), the error is

InvalidArgumentError: Received a label value of 1 which is outside the valid range of [0, 1).  Label values: 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 1 0 0 1 0 0 1 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 0 0 0 1 1 0 0 1 1 0 0 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0
     [[{{node loss_13/dense_48_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] 
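
The valid range `[0, 1)` in this message suggests that the output layer has a single unit, so only label 0 is accepted; that would be the case if `periods_to_predict` is 1. A minimal sketch of a check that could be run before fitting is shown here. The helper name `check_sparse_labels` and the label values are assumptions for illustration, not part of Keras.

```python
import numpy as np

def check_sparse_labels(y, n_output_units):
    """Hypothetical pre-fit check: integer labels must lie in [0, n_output_units)."""
    y = np.asarray(y)
    lo, hi = int(y.min()), int(y.max())
    if lo < 0 or hi >= n_output_units:
        raise ValueError(
            "labels span [%d, %d] but the output layer has %d unit(s); "
            "valid labels are 0..%d" % (lo, hi, n_output_units, n_output_units - 1)
        )
    return True

# Made-up integer labels, shaped (n, 1) like y_train in the question
y_train = np.array([0, 1, 0, 2, 1]).reshape(-1, 1)

# Three output units cover labels 0, 1 and 2
ok_with_three_units = check_sparse_labels(y_train, n_output_units=3)

# One output unit only covers label 0, so the check fails
try:
    check_sparse_labels(y_train, n_output_units=1)
    failed_with_one_unit = False
except ValueError as err:
    failed_with_one_unit = True
    print(err)
```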

Here is my model

# Imports assumed from the standalone Keras 2.x API named in the error message above
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.constraints import maxnorm
from keras.optimizers import Adamax

# periods_to_train, features and periods_to_predict are defined elsewhere (not shown)
model = Sequential([
    LSTM(units=1024,
         input_shape=(periods_to_train, features), kernel_initializer='he_uniform',
         activation='linear', kernel_constraint=maxnorm(3), return_sequences=False),
    Dropout(rate=0.5),
    Dense(units=1024, kernel_initializer='he_uniform',
          activation='linear', kernel_constraint=maxnorm(3)),
    Dropout(rate=0.5),
    Dense(units=1024, kernel_initializer='he_uniform',
          activation='linear', kernel_constraint=maxnorm(3)),
    Dropout(rate=0.5),
    Dense(units=periods_to_predict, kernel_initializer='he_uniform', activation='softmax')])

# Compile model
optimizer = Adamax(lr=0.001, decay=0.1)

model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])

# configure() is the asker's own helper (definition not shown)
configure(gpu_ind=True)
model.fit(X_train, y_train, validation_split=0.1, batch_size=100, epochs=8, shuffle=True)

1 Answer:

Answer 0 (score: 0)

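Your understanding of the network seems correct, so I recreated a minimal working example that generates data the same way you do and trains on it. When I set the number of time steps (periods_to_train) to 953, I also ran into some odd errors. However, several studies have reported that LSTMs do not handle dependencies spanning more than roughly 200 to 500 time steps well, because the model starts to "forget" earlier information.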
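
One way to get from 953 steps down to a shorter history is to keep only the most recent steps of each sequence. This is a sketch under the assumption that the last 100 days carry enough signal, which nothing in the question confirms. The array contents are random and the number of sequences is kept small for illustration.

```python
import numpy as np

# Random stand-in for the full input: (sequences, time steps, features)
n_sequences, n_steps, n_features = 8, 953, 175
X_full = np.random.rand(n_sequences, n_steps, n_features)

window = 100  # assumed history length, matching the example that follows

# Keep only the last `window` time steps of every sequence
X_recent = X_full[:, -window:, :]

print(X_recent.shape)
```

The labels do not change: y_train still holds one class per sequence, for the step that follows the window.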

Below is a minimal working example of what you are trying to do, using only 100 time steps. It runs without errors for me (tensorflow version 1.14.0):

import tensorflow as tf
import tensorflow.keras.backend as K
import numpy as np


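data_size = 637          # number of training sequences
periods_to_train = 100   # time steps per sequence (reduced from 953)
features = 175
periods_to_predict = 3   # units in the output layer, one per class

# Random stand-in data with the same layout as the arrays in the question
X_train = np.random.rand(data_size, periods_to_train, features)
y_train = np.random.randint(0, 3, data_size).reshape(-1, 1)  # integer labels 0..2, shape (637, 1)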

K.clear_session()

model = tf.keras.models.Sequential([
            tf.keras.layers.LSTM(
                    units=1024, input_shape=(periods_to_train,features), kernel_initializer='he_uniform',
                    activation ='linear', kernel_constraint=tf.keras.constraints.max_norm(3.), return_sequences=False),
            tf.keras.layers.Dropout(rate=0.5),
            tf.keras.layers.Dense(
                    units=1024,kernel_initializer='he_uniform', 
                    activation='linear', kernel_constraint=tf.keras.constraints.max_norm(3)),
            tf.keras.layers.Dropout(rate=0.5),
            tf.keras.layers.Dense(
                    units=1024, kernel_initializer='he_uniform',
                    activation='linear', kernel_constraint=tf.keras.constraints.max_norm(3)),
            tf.keras.layers.Dropout(rate=0.5),
            tf.keras.layers.Dense(
                    units=periods_to_predict, kernel_initializer='he_uniform', 
                    activation='softmax')])


optimizer = tf.keras.optimizers.Adamax(lr=0.001, decay=0.1)

model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])

model.fit(X_train, y_train ,validation_split=0.1, batch_size=64, epochs=1, shuffle=True)
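
Because the output layer here has 3 units, `model.predict` would return one row of three class probabilities per sequence, and the predicted class is the index of the largest entry. The sketch below uses made-up probabilities in place of real model output and made-up true labels. Note that labels of shape (n, 1) need flattening before they are compared with the predictions.

```python
import numpy as np

# Made-up softmax outputs for 4 sequences and 3 classes (stand-in for model.predict)
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
])

# Predicted class = column index of the highest probability in each row
pred_classes = probs.argmax(axis=1)

# Made-up true labels, shaped (n, 1) like y_train
y_true = np.array([0, 1, 1, 0]).reshape(-1, 1)

# ravel() avoids broadcasting (4,) against (4, 1) into a (4, 4) comparison
accuracy = (pred_classes == y_true.ravel()).mean()

print(pred_classes)
print(accuracy)
```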