
Question

Hi, I am using Caffe on Ubuntu 14.04 with CUDA 7.0 (latest), cuDNN version 2 (latest), and an NVIDIA GT 730 GPU.

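First I initialize Caffe and load the ImageNet model (AlexNet). I also put Caffe in GPU mode with set_mode_gpu(). Then I take an image and copy it into the network's source blob. Next I run a forward pass on this image with net.forward(end='fc7') and extract the 4096-dimensional fc7 output (the activation features of the fc7 layer).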

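The problem is that when I run the same code several times, I get different results each time. That is, in GPU mode the activation features for the same image differ on every run. Shouldn't the network be deterministic when I am only doing a forward pass? I would expect the same output for the same image every time.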

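On the other hand, when I run Caffe on the CPU with set_mode_cpu(), everything works as expected: I get the same output every time. The code and the outputs I obtained are shown below. I cannot figure out what the problem is. Is it caused by GPU rounding? The error seems far too large for that. Is it an issue with the latest cuDNN version? Or is it something else entirely?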

Here is the code:

1) Import libraries

from cStringIO import StringIO
import numpy as np
import scipy.ndimage as nd
import PIL.Image
from IPython.display import clear_output, Image, display
from google.protobuf import text_format
import scipy
import matplotlib.pyplot as plt
import caffe

2) Load the Caffe model and define utility functions

model_path = '../../../caffe/models/bvlc_alexnet/' 
net_fn   = model_path + 'deploy.prototxt'
param_fn = model_path + 'bvlc_reference_caffenet.caffemodel'

model = caffe.io.caffe_pb2.NetParameter()
text_format.Merge(open(net_fn).read(), model)
model.force_backward = True
open('tmp.prototxt', 'w').write(str(model))

net = caffe.Classifier('tmp.prototxt', param_fn,
                       mean=np.float32([104.0, 116.0, 122.0]),  # ImageNet mean, training set dependent
                       channel_swap=(2, 1, 0),  # the reference model has channels in BGR order instead of RGB
                       image_dims=(227, 227))

caffe.set_mode_gpu()
# caffe.set_mode_cpu()

# a couple of utility functions for converting to and from Caffe's input image layout
def preprocess(net, img):
    return np.float32(np.rollaxis(img, 2)[::-1]) - net.transformer.mean['data']
def deprocess(net, img):
    return np.dstack((img + net.transformer.mean['data'])[::-1])
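For readers unfamiliar with the layout conversion done by the two helpers above, here is a minimal standalone sketch of the same idea that runs without Caffe. The names `MEAN_BGR`, `to_caffe_layout` and `from_caffe_layout` are made up for illustration, and the mean is assumed to be a per-channel value broadcast over the image, which may not match how the mean is stored in every setup.

```python
import numpy as np

# Assumption: per-channel mean in BGR order, shaped (3, 1, 1) so it
# broadcasts over a C x H x W array.
MEAN_BGR = np.float32([104.0, 116.0, 122.0]).reshape(3, 1, 1)

def to_caffe_layout(img_rgb_hwc, mean=MEAN_BGR):
    """H x W x C RGB image -> C x H x W BGR array with the mean subtracted."""
    chw = np.rollaxis(np.float32(img_rgb_hwc), 2)  # (H, W, C) -> (C, H, W)
    return chw[::-1] - mean                        # reverse channels: RGB -> BGR

def from_caffe_layout(blob_bgr_chw, mean=MEAN_BGR):
    """Inverse of to_caffe_layout: C x H x W BGR -> H x W x C RGB."""
    return np.dstack((blob_bgr_chw + mean)[::-1])  # BGR -> RGB, then stack on last axis

# Round trip on a random stand-in image (not real data).
rng = np.random.RandomState(0)
img = rng.randint(0, 256, size=(227, 227, 3)).astype(np.float32)
blob = to_caffe_layout(img)
restored = from_caffe_layout(blob)
```

Checking that `restored` matches `img` is a cheap way to rule out preprocessing as the source of differing outputs before looking at the network itself.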

3) Load the image and set constants

target_img = PIL.Image.open('alpha.jpg')
target_img = target_img.resize((227,227), PIL.Image.ANTIALIAS)
target_img=np.float32(target_img)
target_img=preprocess(net, target_img)

end='fc7'

4) Set the source image and run a forward pass to get the fc7 activation features

src = net.blobs['data']
src.reshape(1,3,227,227) # resize the network's input image size
src.data[0] = target_img
dst = net.blobs[end]
net.forward(end=end)
target_data = dst.data[0]
print dst.data

Below are the outputs of 'print dst.data' from running the above code several times.

Output of the 1st run:

[[-2.22313166 -1.66219997 -1.67641115 ..., -3.62765646 -2.78621101
  -5.06158161]]

Output of the 2nd run:

[[ -82.72431946 -372.29296875 -160.5559845  ..., -367.49728394 -138.7151947
  -343.32080078]]

Output of the 3rd run:

[[-10986.42578125 -10910.08105469 -10492.50390625 ...,  -8597.87011719
   -5846.95898438  -7881.21923828]]

Output of the 4th run:

[[-137360.3125     -130303.53125    -102538.78125    ...,  -40479.59765625
    -5832.90869141   -1391.91259766]]

The output values keep getting larger and, after a while, become smaller again. I cannot make sense of this.
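To put a number on the run-to-run difference instead of eyeballing printed arrays, something like the sketch below could help. `run_to_run_drift` is a hypothetical helper, not part of Caffe. It takes any zero-argument callable that returns the activations, so the stand-in functions here only imitate a stable and a drifting network.

```python
import numpy as np

def run_to_run_drift(forward_fn, n_runs=4):
    """Call forward_fn() n_runs times and return, for each later run,
    the maximum absolute difference from the first run's output."""
    # Copy the first result: blob data is typically a view that later runs overwrite.
    reference = np.array(forward_fn(), dtype=np.float64, copy=True)
    drifts = []
    for _ in range(n_runs - 1):
        out = np.array(forward_fn(), dtype=np.float64)
        drifts.append(float(np.max(np.abs(out - reference))))
    return drifts

# Stand-in for a deterministic forward pass.
def stable():
    return np.ones(4)

# Stand-in for a forward pass whose output grows on every call.
counter = {"calls": 0}
def drifting():
    counter["calls"] += 1
    return np.ones(4) * counter["calls"]

stable_drift = run_to_run_drift(stable)      # all zeros
growing_drift = run_to_run_drift(drifting)   # increases with each run
```

With the code from the question, `forward_fn` would be a small function that calls net.forward(end=end) and returns a copy of net.blobs[end].data[0]. A drift of exactly zero in CPU mode and a large one in GPU mode would confirm the behaviour described above.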

Answer 1

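Switch your network to test mode to avoid the effect of dropout, which is non-deterministic and is only needed in training mode.

Add the following line right after initializing the network:

net.set_phase_test()

This way you should get the same result every time.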
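To illustrate why the phase matters, here is a rough NumPy sketch of how a dropout layer behaves in the two phases. It is not Caffe's actual implementation. `dropout_forward` is a made-up name, and the scaling by 1 / (1 - ratio) during training is the usual "inverted dropout" convention, assumed here.

```python
import numpy as np

def dropout_forward(x, ratio=0.5, train=True, rng=None):
    """Sketch of a dropout layer's forward pass.

    train=True : zero each unit with probability `ratio`, rescale the rest.
    train=False: pass the input through unchanged (deterministic).
    """
    x = np.asarray(x, dtype=np.float32)
    if not train:
        return x.copy()
    if rng is None:
        rng = np.random.RandomState()  # unseeded: a new mask on every call
    keep = rng.uniform(size=x.shape) >= ratio
    return x * keep / (1.0 - ratio)

# Stand-in for a 4096-dimensional activation vector (not real data).
x = np.arange(1, 4097, dtype=np.float32)

train_a = dropout_forward(x, train=True)
train_b = dropout_forward(x, train=True)   # different random mask than train_a
test_a = dropout_forward(x, train=False)
test_b = dropout_forward(x, train=False)   # identical to test_a
```

In training mode, two calls on the same input disagree because each draws a fresh random mask. In test mode, the layer is the identity, so repeated calls agree. As far as I know, later pycaffe releases removed set_phase_test() and take the phase when the network is constructed instead, e.g. caffe.Net(net_fn, param_fn, caffe.TEST), so which call applies depends on your Caffe version.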

Soner

Caffe - Inconsistency in the activation feature values - GPU mode
