Question

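I am using an LSTM in TensorFlow 2.0 and am trying to assign weights to the training samples. (I already tried a class_weights dictionary, but it complained that 3-dimensional arrays are not supported. My array has shape (26000, 7, 1).)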

As the documentation suggests, I set sample_weight_mode to "temporal" in compile(), but when I then try to fit the model I still get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-100-1bb94008291c> in <module>
      1 with tf.device("/device:GPU:0"):
----> 2     model.fit(X, y, epochs=500, batch_size=4096, verbose=1, validation_split=0.2, sample_weight=cw.reshape(26000,7,1))

C:\ProgramData\Anaconda3\envs\thesis-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
    641         max_queue_size=max_queue_size,
    642         workers=workers,
--> 643         use_multiprocessing=use_multiprocessing)
    644 
    645   def evaluate(self,

C:\ProgramData\Anaconda3\envs\thesis-gpu\lib\site-packages\tensorflow\python\keras\engine\training_arrays.py in fit(self, model, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, **kwargs)
    630         steps=steps_per_epoch,
    631         validation_split=validation_split,
--> 632         shuffle=shuffle)
    633 
    634     if validation_data:

C:\ProgramData\Anaconda3\envs\thesis-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, batch_size, check_steps, steps_name, steps, validation_split, shuffle, extract_tensors_from_dataset)
   2459           training_utils.standardize_weights(ref, sw, cw, mode)
   2460           for (ref, sw, cw, mode) in zip(y, sample_weights, class_weights,
-> 2461                                          feed_sample_weight_modes)
   2462       ]
   2463       # Check that all arrays have the same length.

C:\ProgramData\Anaconda3\envs\thesis-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py in <listcomp>(.0)
   2458       sample_weights = [
   2459           training_utils.standardize_weights(ref, sw, cw, mode)
-> 2460           for (ref, sw, cw, mode) in zip(y, sample_weights, class_weights,
   2461                                          feed_sample_weight_modes)
   2462       ]

C:\ProgramData\Anaconda3\envs\thesis-gpu\lib\site-packages\tensorflow\python\keras\engine\training_utils.py in standardize_weights(y, sample_weight, class_weight, sample_weight_mode)
    839     if sample_weight is not None and len(sample_weight.shape) != 1:
    840       raise ValueError('Found a sample_weight array with shape ' +
--> 841                        str(sample_weight.shape) + '. '
    842                        'In order to use timestep-wise sample weights, '
    843                        'you should specify '

ValueError: Found a sample_weight array with shape (26000, 7, 1). In order to use timestep-wise sample weights, you should specify sample_weight_mode="temporal" in compile(). If you just mean to use sample-wise weights, make sure your sample_weight array is 1D.

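I have tried changing the shape of the sample_weight array I pass in (including flattening it), setting shuffle to False, and removing the validation split, all without success.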

Here is the code I am using:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import math

# Import TensorFlow
import tensorflow as tf
from tensorflow.keras.optimizers import RMSprop, Adam, SGD
from tensorflow.keras.layers import SimpleRNN, LSTM, Dense
from tensorflow.keras.models import Sequential

from sklearn.utils import class_weight

class_weights = class_weight.compute_class_weight('balanced',
                                                 np.unique(y_train),
                                                 y_train)

cw = class_weight.compute_sample_weight({False:1, True:50},#'balanced',
                                                 #np.unique(y_train),
                                                 y.flatten())

model = Sequential()
model.add(LSTM(160, return_sequences=True, dropout=0.05))
model.add(LSTM(80, return_sequences=True, dropout=0.05, activation='relu'))
model.add(LSTM(40, return_sequences=True, dropout=0.05, activation='relu'))
model.add(LSTM(10, return_sequences=True, dropout=0.05))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc', tf.keras.metrics.AUC()], sample_weight_mode="temporal")

with tf.device("/device:GPU:0"):
    model.fit(X, y, epochs=500, batch_size=4096, verbose=1, validation_split=0.2, sample_weight=cw.reshape(26000,7,1))

Answer 1

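Sample weights are applied to the outputs (because the weights scale the loss!), not to the inputs, so the dimensions of your input do not matter. The error message is explicitly about the output dimensions.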
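To make the point about weights scaling the loss concrete, here is a minimal sketch in plain Python. It is illustrative only: weighted_loss is a hypothetical helper, the toy values are made up, and the final averaging is an assumption, not Keras's exact reduction. The 1 and 50 weights mirror the {False: 1, True: 50} mapping from the question.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy for a single prediction."""
    p = min(max(y_pred, eps), 1 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def weighted_loss(y_true, y_pred, weights):
    """Average of per-timestep losses, each scaled by its weight.

    All three arguments are nested lists of shape (samples, timesteps),
    so there is exactly one weight per output timestep.
    Hypothetical helper; the averaging is an assumption for illustration.
    """
    total, count = 0.0, 0
    for yt_row, yp_row, w_row in zip(y_true, y_pred, weights):
        for yt, yp, w in zip(yt_row, yp_row, w_row):
            total += w * binary_crossentropy(yt, yp)
            count += 1
    return total / count


# Toy data: 2 samples, 2 timesteps each (made-up values).
y_true = [[0, 1], [0, 0]]
y_pred = [[0.1, 0.6], [0.2, 0.3]]
uniform = [[1, 1], [1, 1]]
weighted = [[1, 50], [1, 1]]  # the single positive timestep gets weight 50

print(weighted_loss(y_true, y_pred, uniform))
print(weighted_loss(y_true, y_pred, weighted))
```

The weights line up element by element with the per-timestep losses, which is why their shape has to follow the output, not the input.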

Answer 2

sample_weight can have at most 2 dimensions. If sample_weight_mode in compile() is None, sample_weight must be 1D. If sample_weight_mode in compile() is 'temporal', sample_weight must be 2D. In addition, y.shape[:sample_weight.ndim] == sample_weight.shape must always hold. For more details, see the source code of keras.engine.training_utils.standardize_weights.
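The shape rules in this answer can be written down as a small check. This is only a sketch of the rule as stated here, not the Keras implementation; check_sample_weight_shape is a hypothetical helper, and the shapes are the ones from the question.

```python
def check_sample_weight_shape(y_shape, sw_shape, sample_weight_mode=None):
    """Return True if sw_shape satisfies the rules described in this answer.

    Hypothetical helper: it mirrors the stated rule, not the actual
    Keras source code.
    """
    # 'temporal' mode expects 2D weights; otherwise weights must be 1D.
    expected_ndim = 2 if sample_weight_mode == "temporal" else 1
    if len(sw_shape) != expected_ndim:
        return False
    # The leading dimensions of y must equal the weight shape.
    return tuple(y_shape[:len(sw_shape)]) == tuple(sw_shape)


y_shape = (26000, 7, 1)  # target shape from the question

# 3D weights, as passed in the question: rejected.
print(check_sample_weight_shape(y_shape, (26000, 7, 1), "temporal"))
# 2D weights, one per (sample, timestep): accepted in temporal mode.
print(check_sample_weight_shape(y_shape, (26000, 7), "temporal"))
# 1D weights, one per sample: accepted when the mode is None.
print(check_sample_weight_shape(y_shape, (26000,), None))
```

Under these rules, the fit() call in the question would presumably need sample_weight=cw.reshape(26000, 7) in place of cw.reshape(26000, 7, 1).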

Setting sample_weight_mode="temporal" doesn't seem to work
