Problem with ImageNet classification using VGG16 pretrained weights

Time: 2018-02-18 10:15:28

Tags: python tensorflow computer-vision conv-neural-network imagenet

I am trying to run a vanilla ImageNet classification in TensorFlow using the VGG16 network (with the VGG16 model obtained through Keras).

However, when I try to classify a sample elephant image, it gives completely unexpected results.

I cannot figure out what the problem might be.

Here is the complete code I used:

pragma solidity ^0.4.19;

contract test {

event randomNumbers(uint[8] numbers, uint[8] randomNumbers);

function testa() public returns (uint[8] bla) {

    //uint something = 12345678123456781234567812345678;
    uint something = uint(keccak256("rockandroll"));

    uint[8] memory array;
    uint[8] memory random;

    for(uint i=0; i<8; i++) {
        uint digit = something % 10000;
        // do something with digit
        array[i] = digit;
        something /= 10000;
        random[i] =digit % 10;
    }

    randomNumbers(array,random);

    return array;
}

Here is the sample output I get:

  

Tensor("input_1:0", shape=(?, 224, 224, 3), dtype=float32)
Tensor("predictions/Softmax:0", shape=(?, 1000), dtype=float32)

     

[[('n02281406', 'sulphur_butterfly', 0.0022673723), ('n01882714', 'koala', 0.0021256246), ('n04325704', 'stole', 0.0020583202), ('n01496331', 'electric_ray', 0.0020416214), ('n01797886', 'ruffed_grouse', 0.0020229272)]]

Judging from the probabilities, it looks as if something is wrong with the image data being passed in (since they are all very low).

But I cannot figure out what is wrong. And, speaking as a human, I am quite sure the image shows an elephant!

2 Answers:

Answer 0: (score: 0)

I think there are 2 mistakes. The first is that you have to rescale your image by dividing all pixels by 255.

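from PIL import Image
import numpy as np

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
# cast to float first: in-place division on a uint8 array raises a TypeError
image_array = np.array(new_img)[:, :, 0:3].astype(np.float32)
image_array /= 255.
image_array = np.expand_dims(image_array, axis=0)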

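The second point came to me when looking at the prediction values. You have a vector of 1000 elements, and after rescaling they all have a prediction of about 0.1%. That means you have an untrained model. I don't know exactly how the weights are loaded in TensorFlow, but in Keras, for example, you can do this: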

from keras import applications

app = applications.vgg16
model = app.VGG16(
        include_top=True,     # keep the standard 1000-class ImageNet classifier head
        weights='imagenet')   # load the pretrained weights; with weights=None they are random

From what I have read, you have to download another file containing the weights, for example from GitHub.

I hope it helps,

EDIT1:

I tried the same model using Keras:

from keras.applications.vgg16 import VGG16, decode_predictions
from PIL import Image
import numpy as np

model = VGG16(weights='imagenet')

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = image_array/255.
x = np.expand_dims(image_array, axis=0)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=5)[0])

With the rescaling applied, as in the code above, the predictions are bad:

  

Predicted: [('n03788365', 'mosquito_net', 0.22725257), ('n15075141', 'toilet_tissue', 0.026636025), ('n04209239', 'shower_curtain', 0.019786758), ('n02804414', 'bassinet', 0.01353887), ('n03131574', 'crib', 0.01316699)]

Without the rescaling (the division by 255 commented out), the result is good:

  

Predicted: [('n02504458', 'African_elephant', 0.95870858), ('n01871265', 'tusker', 0.040065952), ('n02504013', 'Indian_elephant', 0.0012253703), ('n01704323', 'triceratops', 5.0949382e-08), ('n02454379', 'armadillo', 5.0408511e-10)]

Now, if I do not load the weights, I get the "same" thing as you see with TensorFlow:

  

Predicted: [('n07717410', 'acorn_squash', 0.0010033853), ('n02980441', 'castle', 0.0010028203), ('n02124075', 'Egyptian_cat', 0.0010028186), ('n04179913', 'sewing_machine', 0.0010027955), ('n02492660', 'howler_monkey', 0.0010027081)]

To me, this means the weights are not being applied. Maybe they were downloaded but are not used.
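The reasoning about the roughly 0.001 values can be checked without any model at all. The sketch below is only an illustration under stated assumptions: the logits are made up (`flat_logits` and `peaked_logits` are hypothetical, not taken from VGG16), and it uses plain Python rather than Keras. It shows that a softmax over 1000 nearly equal scores gives about 1/1000 per class, which is what an untrained or randomly initialised classifier tends to produce.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

num_classes = 1000

# Hypothetical logits from a network with uninformative weights: all nearly equal.
flat_logits = [0.001 * (i % 7) for i in range(num_classes)]
flat_probs = softmax(flat_logits)

# Hypothetical logits from a trained network: one class clearly stands out.
peaked_logits = [0.0] * num_classes
peaked_logits[386] = 12.0  # arbitrary index, chosen only for illustration
peaked_probs = softmax(peaked_logits)

print(max(flat_probs))    # close to 1/1000
print(max(peaked_probs))  # close to 1
```

So a top-5 list where every score sits near 0.001 points at weights that carry no information, while bad but confident scores (such as 0.22 for mosquito_net) point at the input preprocessing instead.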

Answer 1: (score: 0)

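It seems we can (or need to?) use the session from Keras (which holds the loaded graph together with its weights) instead of creating a new session in TensorFlow and using the graph obtained from the Keras model as below: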

VGG = model.graph  

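I think the graph above does not come with the weights (which is why the predictions were wrong), while the graph in the Keras session has the proper weights (so these two graph instances should be different).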

Here is the complete code:

import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.python.keras._impl.keras.applications import imagenet_utils
from tensorflow.python.keras._impl.keras import backend as K


model = tf.keras.applications.VGG16()
sess = K.get_session()
VGG = model.graph  # Not needed, and also doesn't have weights in it

VGG.get_operations()
input = VGG.get_tensor_by_name("input_1:0")
output = VGG.get_tensor_by_name("predictions/Softmax:0")
print(input)
print(output)

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = np.expand_dims(image_array, axis=0)
image_array = image_array.astype(np.float32)
image_array = tf.keras.applications.vgg16.preprocess_input(image_array)

pred = (sess.run(output,{input:image_array}))
print(imagenet_utils.decode_predictions(pred))

This gives the expected result:

  

[[('n02504458', 'African_elephant', 0.8518132), ('n01871265', 'tusker', 0.1398836), ('n02504013', 'Indian_elephant', 0.0082286), ('n01704323', 'triceratops', 6.965483e-05), ('n02397096', 'warthog', 1.8662439e-06)]]

Thanks to Idavid for the tip about using the preprocess_input() function, and to Nicolas for the tip about the weights not being loaded.
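For readers wondering why preprocess_input() matters here, the following is a minimal sketch of what the VGG16 preprocessing does to a single pixel, as far as I understand it: the channels are reordered from RGB to BGR and the ImageNet per-channel means are subtracted, with no division by 255. The helper names `caffe_style_preprocess` and `divide_by_255` are hypothetical and written in plain Python for illustration; they are not part of Keras.

```python
# ImageNet per-channel means in BGR order, as used by Caffe-style VGG preprocessing.
BGR_MEANS = (103.939, 116.779, 123.68)

def caffe_style_preprocess(pixel_rgb):
    """Reorder one RGB pixel to BGR and subtract the per-channel means."""
    r, g, b = (float(c) for c in pixel_rgb)
    bgr = (b, g, r)
    return tuple(v - m for v, m in zip(bgr, BGR_MEANS))

def divide_by_255(pixel_rgb):
    """The rescaling tried in the first answer, shown for comparison."""
    return tuple(c / 255.0 for c in pixel_rgb)

grey = (128, 128, 128)
print(caffe_style_preprocess(grey))  # values spread around zero, roughly -124..151 over all pixels
print(divide_by_255(grey))           # values squeezed into 0..1
```

The two functions put the input on very different scales, which fits the observation above that dividing by 255 gave poor predictions while preprocess_input() gave the elephant classes.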