Question

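I am building an autoencoder for gene expression data. Some genes are not expressed and appear as NaN in the input. My output (the predictions) is all NaN. This is my loss function: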

def nan_mse(y_actual, y_predicted):
    per_instance = tf.where(tf.is_nan(y_actual),
                            tf.zeros_like(y_actual),
                            tf.square(tf.subtract(y_predicted, y_actual)))
    return tf.reduce_mean(per_instance, axis=0)

And the model:

input_data = Input(shape=(1,num_genes))

#Leaky-Parametric-RelU
#Encoder
x = Dense(num_genes)(input_data)
encoder = PReLU()(x)

#Bottleneck layer
encoded = Dense(64, activation = 'sigmoid')(encoder)

#Decoder
x = Dense(num_genes)(encoded)
decoder = PReLU()(x)

autoencoder = Model(input_data, decoder)
autoencoder.compile(loss=nan_mse, optimizer = 'adam') 
autoencoder.summary()

history = autoencoder.fit(x_train, x_train, epochs=50, verbose=2,
                          callbacks=[MyCustomCallback()])

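My goal is for the network to ignore the NaN values, but it is important to keep them in the input. Can this be done through the loss function?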

Right now the output is NaN. A user here suggested editing the code as follows:

def nan_mse(y_actual, y_predicted):
    stack = tf.stack((tf.is_nan(y_actual), 
                      tf.is_nan(y_predicted)),
                     axis=1)
    is_nans = tf.keras.backend.any(stack, axis=1)
    per_instance = tf.where(is_nans,
                            tf.zeros_like(y_actual),
                            tf.square(tf.subtract(y_predicted, y_actual)))
    print(per_instance)
    return tf.reduce_mean(per_instance, axis=0)

Now I get 0.0000e+00 as the loss, but this does not solve the underlying problem.
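
One possible explanation, offered as a hedged illustration and not a diagnosis of this exact model: a NaN in the input passes through every dense layer before the loss function ever sees it, so masking inside the loss alone may not be enough. The NumPy sketch below uses made-up toy sizes and random weights (`x`, `w`, and `b` are hypothetical, not the model above) to show that a single NaN in a sample turns every output unit for that sample into NaN, and that multiplying by a zero does not clear it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 2 samples, 4 "genes"; the first sample has one NaN.
x = np.array([[1.0, np.nan, 0.5, 2.0],
              [0.3, 0.7, 1.2, 0.1]])

# Hypothetical dense layer with 3 units and random weights.
w = rng.normal(size=(4, 3))
b = np.zeros(3)

# A dense layer computes x @ w + b, so each output sums over all inputs.
hidden = x @ w + b

# Every output of the sample that contained a NaN is now NaN.
row_has_nan = np.isnan(hidden).all(axis=1)
print(row_has_nan.tolist())

# A zero factor (for example a masked-out gradient) does not remove a NaN.
zero_times_nan = 0.0 * np.nan
print(zero_times_nan)
```

Under this reading, the NaN entries would have to be replaced in the input itself, with any masking applied in addition to that.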

Answer 1

import numpy as np
import pandas as pd

x = pd.read_csv("C:/Users/10.csv", index_col=False).groupby(['Column_name_optional']).mean().reset_index()
x = x.replace(np.nan, 0)

Autoencoder predictions are all NaN because of a custom loss function

This replaces all NaN values with zero.
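
If the zero-filled positions should still be ignored when computing the error, one option is to record where the NaN values were before replacing them. The following is a minimal NumPy sketch under that assumption; `fill_and_mask` and `masked_mse` are hypothetical helper names used for illustration, not part of pandas, NumPy, or Keras.

```python
import numpy as np

def fill_and_mask(x):
    """Return a zero-filled copy of x and a 0/1 mask of the observed entries."""
    observed = ~np.isnan(x)
    filled = np.where(observed, x, 0.0)
    return filled, observed.astype(x.dtype)

def masked_mse(y_true_filled, y_pred, mask):
    """Mean squared error computed over the observed entries only."""
    sq_err = mask * (y_pred - y_true_filled) ** 2
    # Guard against division by zero when nothing was observed.
    return sq_err.sum() / max(mask.sum(), 1.0)

# Toy data: two samples, three genes, two missing values.
x = np.array([[1.0, np.nan, 3.0],
              [np.nan, 2.0, 4.0]])
filled, mask = fill_and_mask(x)

# Hypothetical predictions; the values at the missing positions are ignored.
pred = np.array([[1.0, 9.0, 2.0],
                 [9.0, 2.0, 4.0]])
loss = masked_mse(filled, pred, mask)
print(loss)  # 0.25
```

Only the four observed entries contribute here: one of them is off by 1, so the loss is 1 / 4. The same idea could be carried over to a Keras loss, but that translation is not shown and would need to be checked against the TensorFlow version in use.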