Understanding why results differ between Keras and TensorFlow

Time: 2017-08-28 06:17:51

Tags: tensorflow

I am currently doing some work in Keras and TensorFlow, and I stumbled upon a small thing I do not understand. If you look at the code below, I am trying to predict the response of a network either through an explicit TensorFlow session or by using the model's predict_on_batch function.

import os
import keras
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.layers import Dense, Dropout, Flatten, Input
from keras.models import Model

# Try to standardize output
np.random.seed(1)
tf.set_random_seed(1)

# Building the model
inputs = Input(shape=(224,224,3))
base_model = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', \
                                    input_tensor=inputs, input_shape=(224, 224, 3))
x = base_model.get_layer("fc2").output
x = Dropout(0.5, name='model_fc_dropout')(x)
x = Dense(2048, activation='sigmoid', name='final_fc')(x)
x = Dropout(0.5, name='final_fc_dropout')(x)
predictions = Dense(1, activation='sigmoid', name='fcout')(x)
model = Model(outputs=predictions, inputs=inputs)

##################################################################
model.compile(loss='binary_crossentropy',
          optimizer=tf.train.MomentumOptimizer(learning_rate=5e-4, momentum=0.9),
          metrics=['accuracy'])

image_batch = np.random.random((64,224,224,3))

# Outputs predicted by TF
outs = [predictions]
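# learning_phase = 0 selects test mode, so the Dropout layers are disabled for this run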
feed_dict={inputs:image_batch,  K.learning_phase():0}

init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)

    outputs = sess.run(outs, feed_dict)[0]
    print(outputs.flatten())

# Outputs predicted by Keras
outputs = model.predict_on_batch(image_batch)
print(outputs.flatten())

My problem is that I get two different results, even though I tried to remove any source of randomness by setting the seeds to 1 and running the operations on the CPU. Even then, I get the following results:

[ 0.26079229  0.26078743  0.26079154  0.26079673  0.26078942  0.26079443
  0.26078886  0.26079088  0.26078972  0.26078728  0.26079121  0.26079452
  0.26078513  0.26078424  0.26079014  0.26079312  0.26079521  0.26078743
  0.26078558  0.26078537  0.26078674  0.26079136  0.26078632  0.26077667
  0.26079312  0.26078999  0.26079065  0.26078704  0.26078928  0.26078624
  0.26078892  0.26079202  0.26079065  0.26078689  0.26078963  0.26078749
  0.26078817  0.2607986   0.26078528  0.26078412  0.26079187  0.26079246
  0.26079226  0.26078457  0.26078099  0.26078072  0.26078376  0.26078475
  0.26078326  0.26079389  0.26079792  0.26078579  0.2607882   0.2607961
  0.26079237  0.26078218  0.26078638  0.26079753  0.2607787   0.26078618
  0.26078096  0.26078594  0.26078215  0.26079002]

[ 0.25331706  0.25228402  0.2534174   0.25033095  0.24851511  0.25099936
  0.25240892  0.25139931  0.24948661  0.25183493  0.25104815  0.25164133
  0.25214729  0.25265765  0.25128496  0.25249782  0.25247478  0.25314394
  0.25014618  0.25280923  0.2526398   0.25381723  0.25138992  0.25072744
  0.25069866  0.25307226  0.25063521  0.25133523  0.25050756  0.2536433
  0.25164688  0.25054023  0.25117773  0.25352773  0.25157067  0.25173825
  0.25234801  0.25182116  0.25284401  0.25297374  0.25079012  0.25146705
  0.25401884  0.25111189  0.25192681  0.25252578  0.25039044  0.2525287
  0.25165257  0.25357804  0.25001243  0.2495154   0.2531895   0.25270832
  0.25305843  0.25064403  0.25180396  0.25231308  0.25224048  0.25068772
  0.25212681  0.24812476  0.25027585  0.25243458]

Does anyone know what might be happening in the background that could change the results? (These results do not change if the code is run again.)

If the network is run on a GPU (Titan X) the difference is larger; for example, the second output is:

[ 0.3302682   0.33054096  0.32677746  0.32830611  0.32972822  0.32807562
  0.32850873  0.33161065  0.33009702  0.32811245  0.3285495   0.32966742
  0.33050382  0.33156893  0.3300975   0.3298254   0.33350074  0.32991216
  0.32990077  0.33203539  0.32692945  0.33036903  0.33102706  0.32648
  0.32933888  0.33161271  0.32976636  0.33252293  0.32859167  0.33013415
  0.33080408  0.33102706  0.32994759  0.33150592  0.32881773  0.33048317
  0.33040857  0.32924038  0.32986534  0.33131596  0.3282761   0.3292698
  0.32879189  0.33186096  0.32862625  0.33067161  0.329018    0.33022234
  0.32904804  0.32891914  0.33122411  0.32900628  0.33088413  0.32931429
  0.3268061   0.32924181  0.32940546  0.32860965  0.32828435  0.3310211
  0.33098024  0.32997403  0.33025959  0.33133432]

while in the first one the differences only appear from the fifth decimal place onwards:

[ 0.26075357  0.26074868  0.26074538  0.26075155  0.260755    0.26073951
  0.26074919  0.26073971  0.26074231  0.26075247  0.2607362   0.26075858
  0.26074955  0.26074123  0.26074299  0.26074946  0.26074076  0.26075014
  0.26074076  0.26075229  0.26075041  0.26074776  0.26075897  0.26073995
  0.260746    0.26074466  0.26073912  0.26075709  0.26075712  0.26073799
  0.2607322   0.26075566  0.26075059  0.26073873  0.26074558  0.26074558
  0.26074359  0.26073721  0.26074392  0.26074731  0.26074862  0.26074174
  0.26074126  0.26074588  0.26073804  0.26074919  0.26074269  0.26074606
  0.26075307  0.2607446   0.26074025  0.26074648  0.26074952  0.26073608
  0.26073566  0.26073873  0.26074576  0.26074475  0.26074636  0.26073411
  0.2607542   0.26074755  0.2607449   0.2607407 ]

1 Answer:

Answer 0 (score: 1)

The results differ here because the initializations are different.

TF uses this init_op to initialize the variables:

sess.run(init_op)

But Keras uses its own init_op inside its Model class, not the init_op defined in your code. Running tf.global_variables_initializer() in a fresh session therefore re-initializes every variable (including the ImageNet weights that VGG16 had already loaded), which is why the explicit-session outputs no longer match model.predict_on_batch.
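
A minimal sketch of one way to check this (not part of the original answer, and assuming the variables from the question — inputs, predictions, image_batch, model — are still in scope): evaluate the graph in the session Keras itself populated, obtained via K.get_session(), instead of creating a new session and re-running the initializer.

# Sketch: reuse the session Keras filled in when the model was built,
# so the loaded/initialized weights are kept instead of being re-initialized.
sess = K.get_session()
feed_dict = {inputs: image_batch, K.learning_phase(): 0}
outputs_tf = sess.run(predictions, feed_dict=feed_dict)

# Compare against Keras's own prediction path; the two should now agree
# up to floating-point tolerance.
outputs_keras = model.predict_on_batch(image_batch)
print(np.allclose(outputs_tf, outputs_keras, atol=1e-6))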