Question

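My dataset has 27 features. I want to use 26 of them as input and the last one as output (this feature is also the last column of the dataset). I am using code for a multiple-input, multi-step-output LSTM. After defining the function "split_sequences(sequences, n_steps_in, n_steps_out)", I split the dataset into a training set and a test set and chose the number of time steps for n_steps_in and n_steps_out. After converting from 2D to 3D with "split_sequences(train, n_steps_in, n_steps_out)", I printed the shapes of train_X, train_y, test_X and test_y. The result is: (14476887, 25, 26) (14476887, 20) (7130386, 25, 26) (7130386, 20)

My goal is to use the first 26 columns (features) as input and the last column (feature 27) as output. But in my code there is no command that splits the dataset into input and output.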

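My questions are:

1.) Does Python count from 0 upwards, so that 0 is my first feature, or does it start counting from 1?

2.) Does Python work from left to right, so that the leftmost feature in the CSV file is my first feature, and so on?

3.) Is the shape (7130386, 20) above equivalent to (7130386, 20, 1), or why is it 2D?

4.) How can I explicitly specify my 26 inputs and 1 output in the code?

5.) Where does the model get the information about the 26 features in the code shown below?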
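Questions 1, 2 and 4 all come down to how NumPy indexes columns. The following is a minimal sketch, assuming a small made-up array in place of the real dataset (the 5 x 27 size and the variable names are illustrative only) and assuming the columns were loaded from the CSV without reordering:

```python
import numpy as np

# Toy stand-in for the real dataset: 5 rows, 27 columns (assumed sizes).
# Column j holds the value j in every row, so columns are easy to tell apart.
values = np.tile(np.arange(27), (5, 1))

# Indexing is zero-based: index 0 is the first (leftmost) column,
# index 26 (or -1) is the 27th and last column.
first_feature = values[:, 0]
last_feature = values[:, -1]

# ":-1" selects every column except the last one -> the 26 inputs.
# "-1" selects only the last column -> the single output.
inputs = values[:, :-1]
output = values[:, -1]

print(inputs.shape)   # (5, 26)
print(output.shape)   # (5,)
print(first_feature[0], last_feature[0])  # 0 26
```

The same two slices, ":-1" and "-1", appear inside split_sequences, which is where the input/output separation happens in the code below.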

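I hope I have explained my problem and my questions well.

I am new to LSTMs and don't know how to proceed. I am quite desperate, because I am a PhD student and have already done a lot of work for my thesis. Now it is time to build the prediction model, and I cannot make progress.

I hope one of you can help me.

Thank you very much.

Ali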

Split a multivariate sequence into samples

    from numpy import array

    def split_sequences(sequences, n_steps_in, n_steps_out):
        X, y = list(), list()
        for i in range(len(sequences)):
            # find the end of this pattern
            end_ix = i + n_steps_in
            out_end_ix = end_ix + n_steps_out - 1
            # check if we are beyond the dataset
            if out_end_ix > len(sequences):
                break
            # gather input and output parts of the pattern:
            # all columns but the last as input, the last column as output
            seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
            X.append(seq_x)
            y.append(seq_y)
        return array(X), array(y)

Split into training and test sets

    train_size = int(len(values) * 0.67)
    test_size = len(values) - train_size
    train, test = values[0:train_size, :], values[train_size:len(values), :]
    print(len(train), len(test))

14476930 7130429

Choose the number of time steps

    n_steps_in, n_steps_out = 25, 20

    train_X, train_y = split_sequences(train, n_steps_in, n_steps_out)
    test_X, test_y = split_sequences(test, n_steps_in, n_steps_out)
    print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)

(14476887, 25, 26) (14476887, 20) (7130386, 25, 26) (7130386, 20)
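On question 3: a 2D shape (n, 20) and a 3D shape (n, 20, 1) hold the same numbers but are not the same shape. The following is a hedged sketch with a small made-up array (4 samples is an illustrative size, not the real data) showing how one form converts to the other:

```python
import numpy as np

# Toy stand-in for train_y: 4 samples, 20 output steps (assumed sizes).
y_2d = np.arange(4 * 20).reshape(4, 20)

# y is 2D because only a single column (the last one) was selected per step,
# so there is no separate feature axis. Adding one gives (samples, steps, 1).
y_3d = y_2d.reshape(y_2d.shape[0], y_2d.shape[1], 1)

print(y_2d.shape)  # (4, 20)
print(y_3d.shape)  # (4, 20, 1)

# Same values, different layout.
print(np.array_equal(y_2d, y_3d[:, :, 0]))  # True
```

Which of the two forms is needed depends on the model's output layer, which is not shown here.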

I don't know what the split_sequences function actually does.
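One way to see what split_sequences does is to run it on a tiny array where every value is easy to trace. The sketch below repeats the function so that it is self-contained and assumes made-up toy data (10 time steps, 2 input columns plus 1 output column, with 3 input steps and 2 output steps); none of these sizes come from the real dataset.

```python
import numpy as np

def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = [], []
    for i in range(len(sequences)):
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out - 1
        if out_end_ix > len(sequences):
            break
        # Input window: n_steps_in rows, every column except the last.
        X.append(sequences[i:end_ix, :-1])
        # Output window: n_steps_out rows of the last column only,
        # starting at the last row of the input window.
        y.append(sequences[end_ix - 1:out_end_ix, -1])
    return np.array(X), np.array(y)

# Toy data: row r is [3r, 3r+1, 3r+2]; the last column is the output.
data = np.arange(30).reshape(10, 3)

X, y = split_sequences(data, n_steps_in=3, n_steps_out=2)

print(X.shape, y.shape)  # (7, 3, 2) (7, 2)
print(X[0].tolist())     # [[0, 1], [3, 4], [6, 7]] -> rows 0-2, columns 0-1
print(y[0].tolist())     # [8, 11] -> last column of rows 2-3

# The number of input features is the last axis of X.
n_features = X.shape[2]
print(n_features)        # 2
```

With these window sizes the number of samples is len(data) - n_steps_in - n_steps_out + 2, which for the real data gives 14476930 - 25 - 20 + 2 = 14476887, matching the printed shape. The last axis of train_X (26 in the real data) is the value a model definition would typically take as its feature count, which is presumably how the model learns about the 26 features.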
