I have an image of size 34 × 41 (1394 pixels). I use the img_to_graph function like this:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
from sklearn.cluster import spectral_clustering
from sklearn.feature_extraction import image
from PIL import Image
pic = Image.open('Chrome.png')
pic = pic.convert('L')                # single-channel grayscale
data = np.array(pic).astype(float)
affinity = image.img_to_graph(data)   # graph of gradient connections between neighbouring pixels
print(affinity)
print(affinity.shape)
print(type(affinity))
print(affinity.toarray())
Then I get:
(0, 1) 0.0039215686274509665
(1, 2) 0.0
(2, 3) 0.0039215686274509665
(3, 4) 0.0
(4, 5) 0.0039215686274509665
(5, 6) 0.0039215686274509665
...
(1388, 1388) 0.43529411764705883
(1389, 1389) 0.42745098039215684
(1390, 1390) 0.4235294117647059
(1391, 1391) 0.4196078431372549
(1392, 1392) 0.41568627450980394
(1393, 1393) 0.4117647058823529
(1394, 1394)
<class 'scipy.sparse.coo.coo_matrix'>
[[0.81960784 0.00392157 0. ... 0. 0. 0. ]
[0.00392157 0.81568627 0. ... 0. 0. 0. ]
[0. 0. 0.81568627 ... 0. 0. 0. ]
...
[0. 0. 0. ... 0.41960784 0.00392157 0. ]
[0. 0. 0. ... 0.00392157 0.41568627 0.00392157]
[0. 0. 0. ... 0. 0.00392157 0.41176471]]
Why are the row and column indices different at first, but identical in the last few lines? And why doesn't the printout of affinity show all the non-zero values that appear in affinity.toarray(), e.g. affinity.toarray()[0,0] (0.81960784)? The printed affinity starts right away at (0, 1) with 0.0039215686274509665.
I couldn't find anything helpful in the sklearn documentation. Can anyone help?
Answer 0 (score: 0)
I can't tell you what happened to your particular values, since I don't have your image. But I can show you how they are computed.
The following is copied straight from https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/image.py:
def _make_edges_3d(n_x, n_y, n_z=1):
    """Returns a list of edges for a 3D image.

    Parameters
    ===========
    n_x : integer
        The size of the grid in the x direction.
    n_y : integer
        The size of the grid in the y direction.
    n_z : integer, optional
        The size of the grid in the z direction, defaults to 1
    """
    vertices = np.arange(n_x * n_y * n_z).reshape((n_x, n_y, n_z))
    edges_deep = np.vstack((vertices[:, :, :-1].ravel(),
                            vertices[:, :, 1:].ravel()))
    edges_right = np.vstack((vertices[:, :-1].ravel(),
                             vertices[:, 1:].ravel()))
    edges_down = np.vstack((vertices[:-1].ravel(), vertices[1:].ravel()))
    edges = np.hstack((edges_deep, edges_right, edges_down))
    return edges
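To see what this returns, here is a quick check of my own on a tiny 3 × 3 single-channel grid (this small example is not part of the scikit-learn source): each column of the result pairs a flat pixel index with the flat index of one of its right/down neighbours.

print(_make_edges_3d(3, 3, 1))
# [[0 1 3 4 6 7 0 1 2 3 4 5]
#  [1 2 4 5 7 8 3 4 5 6 7 8]]
# the first six columns are the "right" edges (0-1, 1-2, 3-4, ...),
# the last six columns are the "down" edges (0-3, 1-4, 2-5, ...)

With that in mind, here is the same computation on a random 20 × 20 test image: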
# build a random 20x20 test image
img = np.random.rand(20, 20) * 255
img = img.astype(int)
plt.imshow(img)

# img_to_graph works on 3D (x, y, z) volumes, so add a trivial z axis
img = np.atleast_3d(img)
print(img.shape)
# (20, 20, 1)
n_x, n_y, n_z = img.shape
edges = _make_edges_3d(n_x, n_y, n_z)

# calculate the gradient for the first edge, which connects voxels 0 and 1
edge1 = edges[0, 0]
print(edge1)  # 0
edge2 = edges[1, 0]
print(edge2)  # 1

# unravel the flat voxel indices back into (x, y, z) coordinates and take the
# absolute difference of the two pixel values
a1 = img[edge1 // (n_y * n_z), (edge1 % (n_y * n_z)) // n_z, (edge1 % (n_y * n_z)) % n_z]
a2 = img[edge2 // (n_y * n_z), (edge2 % (n_y * n_z)) // n_z, (edge2 % (n_y * n_z)) % n_z]
pixel = np.abs(a1 - a2)

# the same value shows up as the (0, 1) entry of the graph
print(pixel == image.img_to_graph(img, return_as=np.ndarray)[0, 1])  # True
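As for the diagonal values you are asking about ([0, 0] = 0.81960784): reading a bit further in the same source file, img_to_graph also places the raveled image itself on the diagonal, so graph[i, i] is just pixel i's own value, while the off-diagonal entries are the neighbour gradients. The COO matrix is built with the edge weights first and the diagonal last, which is why your truncated printout of affinity starts with (row, col) pairs like (0, 1) and ends with (row, row) pairs. A quick check of my own, continuing the example above:

graph = image.img_to_graph(img, return_as=np.ndarray)
print(graph[0, 0] == img.ravel()[0])              # True: diagonal entries are the pixel values themselves
print(np.allclose(np.diag(graph), img.ravel()))   # True for every pixel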