Question

我正在尝试使用主要组件分析（PCA）使用python实现面部识别。其中一个步骤是通过减去平均面部向量T：m来规范化输入（测试）图像n = T - m。

这是我的代码：

#Step1: put database images into a 2D array
filenames = glob.glob('C:\\Users\\Karim\\Downloads\\att_faces\\New folder/*.pgm')
filenames.sort()
img = [Image.open(fn).convert('L').resize((90, 90)) for fn in filenames]
images = np.asarray([np.array(im).flatten() for im in img])

#Step 2: find the mean image and the mean-shifted input images
m = images.mean(axis=0)
shifted_images = images - m

#Step 7: input image
input_image = Image.open('C:\\Users\\Karim\\Downloads\\att_faces\\1.pgm').convert('L').resize((90, 90))
T = np.asarray(input_image)
n = T - mean_image

但我收到错误Traceback (most recent call last): File "C:/Users/Karim/Desktop/Bachelor 2/New folder/new3.py", line 46, in <module> n = T - m ValueError: operands could not be broadcast together with shapes (90,90) (8100)

Answer 1

为扁平数组计算了

mean_image：

images = np.asarray([np.array(im).flatten() for im in img])
mean_image = images.mean(axis=0)

和input_image是90x90。因此错误。您也应该压平输入图像，或者不要压平原始图像（我不太明白为什么要这样做），或者仅为此操作将mean_image调整为90x90。

Answer 2

正如@Lev所说，你已经弄平了阵列。你实际上并不需要这样做来执行平均值。假设您有2个3x4图像的数组，那么您将拥有以下内容：

In [291]: b = np.random.rand(2,3,4)

In [292]: b.shape
Out[292]: (2, 3, 4)

In [293]: b
Out[293]: 
array([[[ 0.18827554,  0.11340471,  0.45185287,  0.47889188],
        [ 0.35961448,  0.38316556,  0.73464482,  0.37597429],
        [ 0.81647845,  0.28128797,  0.33138755,  0.55403119]],

       [[ 0.92025024,  0.55916671,  0.23892798,  0.59253267],
        [ 0.15664109,  0.12457157,  0.28139198,  0.31634361],
        [ 0.33420446,  0.27599807,  0.40336601,  0.67738928]]])

在第一个轴上执行平均值，保留数组的形状：

In [300]: b.mean(0)
Out[300]: 
array([[ 0.55426289,  0.33628571,  0.34539042,  0.53571227],
       [ 0.25812778,  0.25386857,  0.5080184 ,  0.34615895],
       [ 0.57534146,  0.27864302,  0.36737678,  0.61571023]])

In [301]: b - b.mean(0)
Out[301]: 
array([[[-0.36598735, -0.222881  ,  0.10646245, -0.0568204 ],
        [ 0.10148669,  0.129297  ,  0.22662642,  0.02981534],
        [ 0.24113699,  0.00264495, -0.03598923, -0.06167904]],

       [[ 0.36598735,  0.222881  , -0.10646245,  0.0568204 ],
        [-0.10148669, -0.129297  , -0.22662642, -0.02981534],
        [-0.24113699, -0.00264495,  0.03598923,  0.06167904]]])

对于许多用途，这也比将图像保持为数组列表更快，因为numpy操作是在一个数组上完成的，而不是通过数组列表完成的。大多数方法（例如mean，cov等）接受axis参数，您可以列出所有维度来执行它而无需展平。

要将此应用于您的脚本，我会做这样的事情，保持原始的维度：

images = np.asarray([Image.open(fn).convert('L').resize((90, 90)) for fn in filenames])
# so images.shape = (len(filenames), 90, 90)

m = images.mean(0)
# numpy broadcasting will automatically subract the (90, 90) mean image from each of the `images`
# m.shape = (90, 90)
# shifted_images.shape = images.shape = (len(filenames), 90, 90)
shifted_images = images - m 

#Step 7: input image
input_image = Image.open(...).convert('L').resize((90, 90))
T = np.asarray(input_image)
n = T - m

作为最终评论，如果速度是一个问题，使用np.dstack加入你的图片会更快：

In [354]: timeit b = np.asarray([np.empty((50,100)) for i in xrange(1000)])
1 loops, best of 3: 824 ms per loop

In [355]: timeit b = np.dstack([np.empty((50,100)) for i in xrange(1000)]).transpose(2,0,1)
10 loops, best of 3: 118 ms per loop

但很可能加载图像大部分时间都是如此，如果是这种情况，你可以忽略它。

Python - ValueError：操作数无法与形状一起广播

2 个答案: