Question

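Suppose we are trying to predict future earthquakes from previous ones. A visualisation of part of a dataset containing earthquake sequences might look like this (features on the left, labels on the right, x = month, y = earthquake magnitude):

You and I can see that the pattern here is that when two magnitude-5 earthquakes occur in succession, they are followed by a magnitude-8 earthquake and then a magnitude-4 earthquake. The timing of these earthquakes may vary by 3-5 months, but the magnitudes stay the same.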

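Training an LSTM on this data with a mean squared error loss function gives the following predictions (dashed lines):

It has done exactly what we asked of it and learned to predict the mean at each time step. However, in terms of what the model is intended for, this is a poor prediction.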
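To make the "predicting the mean" effect concrete, here is a small NumPy sketch (not part of the model itself). It takes the three label sequences that follow the two-quake pattern and compares the MSE of a blurred per-time-step mean against the MSE of a sharp 8-then-4 prediction placed at the middle timing. The helper name `mse` and the two candidate predictions are chosen only for illustration.

```python
import numpy as np

# The three label sequences that follow the "two magnitude-5 quakes" pattern
# (same values as the non-zero rows of the labels array in the code further down).
labels = np.array([
    [0.8, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0, 0.4, 0.0],
    [0.0, 0.8, 0.0, 0.0, 0.4, 0.0, 0.0],
])

def mse(prediction, targets):
    """Mean squared error of one prediction against every target sequence."""
    return float(np.mean((targets - prediction) ** 2))

# Candidate 1: the per-time-step mean of the labels (blurred, low peaks).
blurred = labels.mean(axis=0)
# Candidate 2: a sharp 8-then-4 sequence placed at the middle timing.
sharp = labels[2]

mse_blurred = mse(blurred, labels)
mse_sharp = mse(sharp, labels)
print(f"MSE of blurred mean prediction: {mse_blurred:.4f}")
print(f"MSE of sharp prediction:        {mse_sharp:.4f}")
```

The blurred prediction scores roughly 0.076 and the sharp one roughly 0.152, so a per-time-step MSE actively prefers the smeared-out output even though it never predicts a magnitude-8 event.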

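Looking at the data, you or I could infer that the first predicted earthquake will be magnitude 8 and, allowing for the variation in timing, that our best guess is that it occurs at time step 8. So there is an average error along both the time axis and the value axis.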

What loss function can be used for this in TensorFlow?

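I have been searching but can only find material that deals with per-time-step error. I started experimenting with operations that I thought might give better results if applied to the labels and predictions before taking the mean squared error, such as tf.sort, and then tf.argsort on the labels and predictions followed by a mean squared error on the result to get a time error. However, argsort has no gradient for backpropagation, so I started trying to write a custom gradient, which feels like overkill when a solution must already exist? (P.S. I am comfortable with TensorFlow, but I am not an expert.)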
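One candidate that avoids sorting altogether, offered here as an assumption rather than a confirmed answer, is to compare running sums along the time axis and take the mean absolute difference (a 1-D earth-mover-style distance). The NumPy sketch below only checks how such a loss behaves on the example data; the function name `cumulative_abs_error` is hypothetical, and whether it trains well with an LSTM is untested. In TensorFlow the same expression could presumably be built from tf.cumsum, tf.abs and tf.reduce_mean, which all support backpropagation.

```python
import numpy as np

def mse(targets, predictions):
    """Ordinary per-time-step mean squared error."""
    return float(np.mean((targets - predictions) ** 2))

def cumulative_abs_error(targets, predictions):
    """Mean absolute difference between running sums along the time axis.

    Inputs have shape (batch, time). A peak predicted k steps away from its
    target contributes for k steps, so the loss grows with the displacement.
    """
    diff = np.cumsum(targets, axis=1) - np.cumsum(predictions, axis=1)
    return float(np.mean(np.abs(diff)))

labels = np.array([
    [0.8, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0, 0.4, 0.0],
    [0.0, 0.8, 0.0, 0.0, 0.4, 0.0, 0.0],
])
target = labels[0:1]      # peaks at time steps 0 and 3
late_by_1 = labels[2:3]   # same peaks, one step later
late_by_2 = labels[1:2]   # same peaks, two steps later

# Plain MSE cannot tell a one-step delay from a two-step delay.
mse_1, mse_2 = mse(target, late_by_1), mse(target, late_by_2)
# The cumulative loss increases with the delay.
cum_1 = cumulative_abs_error(target, late_by_1)
cum_2 = cumulative_abs_error(target, late_by_2)

# Against all three timings, compare a sharp prediction with the blurred mean.
sharp = labels[2:3]
blurred = labels.mean(axis=0, keepdims=True)
cum_sharp = cumulative_abs_error(labels, sharp)
cum_blurred = cumulative_abs_error(labels, blurred)

print(f"MSE: 1 step late {mse_1:.4f}, 2 steps late {mse_2:.4f}")
print(f"cumulative: 1 step late {cum_1:.4f}, 2 steps late {cum_2:.4f}")
print(f"cumulative: sharp {cum_sharp:.4f}, blurred {cum_blurred:.4f}")
```

On this toy data the cumulative loss ranks the sharp, middle-timed prediction ahead of the blurred mean, the opposite of what per-time-step MSE does. The absolute value matters here: squaring the cumulative differences instead would again be minimised by the blurred mean.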

If anyone could point me in the right direction, I would be very grateful!

The code below demonstrates the example above:

# Written against the TensorFlow 1.x API (tf.estimator, tf.losses, tf.train).
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

features = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 
                     [0.0, 0.0, 0.5, 0.0, 0.5, 0.0, 0.0], 
                     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 
                     [0.0, 0.0, 0.5, 0.0, 0.5, 0.0, 0.0], 
                     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 
                     [0.0, 0.0, 0.5, 0.0, 0.5, 0.0, 0.0]]).reshape(6, 7, 1).astype(np.float32)
labels = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                   [0.8, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.8, 0.0, 0.0, 0.4, 0.0],
                   [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                   [0.0, 0.8, 0.0, 0.0, 0.4, 0.0, 0.0]]).reshape(6, 7, 1).astype(np.float32)

plt.figure(figsize=(20, 2))
plt.plot(np.concatenate((features[:, :, 0].T, labels[:, :, 0].T)) * 10)
plt.axvline(features.shape[1]-1)
plt.title('<- Features vs. Labels ->')
plt.xlabel('Month')
plt.ylabel('Magnitude')
plt.show()

def train_input_fn(features, labels):
    # Full-batch training: shuffle, repeat indefinitely, one batch per pass.
    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    return dataset.shuffle(len(features)).repeat().batch(len(features))


def predict_input_fn(features):
    # Single pass over the features, in their original order.
    dataset = tf.data.Dataset.from_tensor_slices(features)
    return dataset.batch(len(features))

def model_fn(features, labels, mode):
    lstm = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(units=100, return_sequences=True, input_shape=(7, 1)))(features)
    outputs = tf.keras.layers.Dense(units=1, activation='linear')(lstm)

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=outputs)

    loss = tf.losses.mean_squared_error(labels, outputs)
    optimiser = tf.train.AdamOptimizer(learning_rate=0.01)
    train_op = optimiser.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(model_fn=model_fn)
estimator.train(input_fn=lambda:train_input_fn(features, labels), steps=100)
predictions = np.array(list(estimator.predict(input_fn=lambda:predict_input_fn(features))))

plt.figure(figsize=(20, 2))
plt.plot(np.concatenate((features[:, :, 0].T, labels[:, :, 0].T)) * 10)
plt.plot(np.concatenate((features[:, :, 0].T, predictions[:, :, 0].T)) * 10, ls='--')
plt.axvline(features.shape[1]-1)
plt.title('<- Features vs. Labels -> (with model predictions \'--\')')
plt.xlabel('Month')
plt.ylabel('Magnitude')
plt.show()

TensorFlow, neural network, LSTM: loss function for time error

0 answers: