Question

Reproducible:

ipt = Input(batch_shape=batch_shape)
x   = Conv2D(6, (8, 8), strides=(2, 2), activation='relu')(ipt)
x   = Flatten()(x)
out = Dense(6, activation='softmax')(x)

Not reproducible:

ipt = Input(batch_shape=batch_shape)
x   = Conv2D(6, (8, 8), strides=(2, 2), activation='relu')(ipt)
x   = Conv2D(6, (8, 8), strides=(2, 2), activation='relu')(x)
x   = Flatten()(x)
out = Dense(6, activation='softmax')(x)

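The difference is greatly magnified with a larger model and with real data in place of random noise: up to a 30% (relative) difference in accuracy within a single small epoch. Environment setup, sources considered, and a full minimal reproducible example are below. Relevant Git

What is the problem, and how can it be fixed?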

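Possible sources: ([x] = ruled out)

[x] TF2 vs. TF1; Keras 2.3.0+ vs. Keras 2.2.5 (all tested)
[x] Random seeds (numpy, tf, random, PYTHONHASHSEED)
[x] Data values / shuffling (same values, no shuffling)
[x] Weight initializations (same values)
[x] GPU usage (CPU used)
[x] CPU multithreading (single thread used; see also "more" below)
[x] Numeric imprecision (float64 used; also, the size of the difference is too large to be explained by numeric imprecision)
[x] Bad CUDA install (all official guide tests passed, TF detects GPU and CUDA)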
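
One way to double-check the "same values" items above is to compare the arrays from two runs directly. The following is a minimal sketch, not part of the original test setup: `max_abs_diffs` is a hypothetical helper, and it assumes the weights are available as lists of NumPy arrays, such as those returned by `model.get_weights()`.

```python
import numpy as np

def max_abs_diffs(weights_a, weights_b):
    """Return the largest absolute element-wise difference per weight array.

    Hypothetical helper: both arguments are assumed to be lists of NumPy
    arrays with matching shapes, e.g. from model.get_weights() in two runs.
    """
    if len(weights_a) != len(weights_b):
        raise ValueError("weight lists differ in length")
    return [float(np.max(np.abs(a - b))) for a, b in zip(weights_a, weights_b)]

# Stand-in arrays (not real model weights) to show the call.
run1 = [np.zeros((2, 2)), np.ones(3)]
run2 = [np.zeros((2, 2)), np.array([1.0, 1.0, 1.5])]
diffs = max_abs_diffs(run1, run2)
print(diffs)
```

A result of all zeros before training would support the claim that initialization is identical; comparing again after one batch could help narrow down where the runs start to diverge.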

Environment:

CUDA 10.0.130, cuDNN 7.6.0, Windows 10, GTX 1070
Python 3.7.4, Spyder 3.3.6, Anaconda 3.0 2019.10
Anaconda Powershell Prompt terminal used to set PYTHONHASHSEED and start Spyder

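Observations:

float64 vs. float32: no noticeable difference
CPU vs. GPU: no noticeable difference
Conv1D is also non-reproducible
Reproducible with Dense in place of Conv; other layers not tested
For a larger model (still small), the loss difference is huge within a single epoch: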

one_epoch_loss = [1.6814, 1.6018, 1.6577, 1.6789, 1.6878, 1.7022, 1.6689]
one_epoch_acc  = [0.2630, 0.3213, 0.2991, 0.3185, 0.2583, 0.2463, 0.2815]
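
To put a number on the spread in these lists, here is a small sketch. It assumes "relative difference" means (max - min) / min, which is one reasonable reading but is not stated in the question; `relative_spread` is a hypothetical helper name.

```python
# Values copied from the question.
one_epoch_loss = [1.6814, 1.6018, 1.6577, 1.6789, 1.6878, 1.7022, 1.6689]
one_epoch_acc  = [0.2630, 0.3213, 0.2991, 0.3185, 0.2583, 0.2463, 0.2815]

def relative_spread(values):
    """(max - min) / min; an assumed definition of 'relative difference'."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo

loss_spread = relative_spread(one_epoch_loss)
acc_spread = relative_spread(one_epoch_acc)
print(f"loss: {loss_spread:.1%}, acc: {acc_spread:.1%}")
```

Under that assumed definition, the accuracy spread comes out near 30% and the loss spread near 6%, in line with the "up to 30%" figure quoted in the question.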

Code:

batch_shape = (32, 64, 64, 3)
num_samples = 1152

ipt = Input(batch_shape=batch_shape)
x   = Conv2D(6, (8, 8), strides=(2, 2), activation='relu')(ipt)
x   = Conv2D(6, (8, 8), strides=(2, 2), activation='relu')(x)
x   = Flatten()(x)
out = Dense(6, activation='softmax')(x)
model = Model(ipt, out)
model.compile('adam', 'sparse_categorical_crossentropy')

X = np.random.randn(num_samples, *batch_shape[1:])
y = np.random.randint(0, 6, (num_samples, 1))

reset_seeds()
model.fit(X, y, epochs=5, shuffle=False)

Imports/setup:

import os
os.environ['PYTHONHASHSEED'] = '0'
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
import numpy as np
np.random.seed(1)
import random
random.seed(2)

import tensorflow as tf
session_conf = tf.ConfigProto(
      intra_op_parallelism_threads=1,
      inter_op_parallelism_threads=1)
sess = tf.Session(config=session_conf) # single-threading; TF1-only

def reset_seeds():
    np.random.seed(1)
    random.seed(2)
    if tf.__version__[0] == '2':
        tf.random.set_seed(3)
    else:
        tf.set_random_seed(3)
    print("RANDOM SEEDS RESET")
reset_seeds()

from keras.layers import Input, Dense, Conv2D, Flatten
from keras.models import Model
import keras.backend as K

K.set_floatx('float64')

Why does stacking CNNs wreck reproducibility (even with seeds and CPU)?

0 answers: