I am using the Spectral Clustering library, and a similarity matrix is its main argument. My matrix looks like this:
[[ 1.00000000e+00 8.47085137e-01 8.49644498e-01 8.49746438e-01
2.96473454e-01 8.50540412e-01 8.49462072e-01 8.50839475e-01
8.45951343e-01 5.76448265e-01 8.48265736e-01 8.43378943e-01
3.75348067e-01 1.17626480e-01 2.50357519e-01 8.50495202e-01
9.97541755e-01 8.49835674e-01 8.48770171e-01 8.45869271e-01
-5.97205241e-02]
[ 8.47085137e-01 1.00000000e+00 9.98547894e-01 9.98803332e-01
2.22305018e-01 9.98755219e-01 9.98502380e-01 9.98402601e-01
9.98778885e-01 5.66416311e-01 9.98639207e-01 9.98452172e-01
-6.10479042e-02 2.46741344e-02 -4.14116930e-03 9.98357419e-01
8.48955204e-01 9.98525354e-01 9.98900440e-01 9.98426618e-01
-6.51839614e-02]
[ 8.49644498e-01 9.98547894e-01 1.00000000e+00 9.98764222e-01
1.59017501e-01 9.98777492e-01 9.98797005e-01 9.98756310e-01
9.98785822e-01 5.71955127e-01 9.98834038e-01 9.98652820e-01
-5.95467715e-02 1.98107829e-02 -3.88527970e-03 9.98810942e-01
8.51337460e-01 9.98882675e-01 9.98815975e-01 9.98789494e-01
-6.69662309e-02]
[ 8.49746438e-01 9.98803332e-01 9.98764222e-01 1.00000000e+00
4.73518047e-01 9.98684853e-01 9.98839959e-01 9.99029920e-01
9.98804479e-01 5.67855583e-01 9.98759386e-01 9.98796277e-01
-6.07517782e-02 1.71388383e-02 -3.20996100e-03 9.98669121e-01
8.51600753e-01 9.98681806e-01 9.99072484e-01 9.98702177e-01
-6.29855810e-02]
[ 3.52784328e-01 2.41076867e-01 2.01621082e-01 4.11538647e-01
9.92999574e-01 2.09351787e-01 2.12464918e-01 1.84566399e-01
2.82162287e-01 8.88835155e-01 1.90613041e-01 2.12150578e-01
2.92104260e-01 6.25221827e-02 8.70607365e-01 2.88645877e-01
3.09283827e-01 2.81253950e-01 1.80307149e-01 2.49082955e-01
5.46192492e-02]
...
[ -5.97205241e-02 -6.51839614e-02 -6.69662309e-02 -6.29855810e-02
7.86918277e-02 -6.49002943e-02 -6.12003747e-02 -6.34500592e-02
-6.75593439e-02 7.23869691e-02 -6.20686862e-02 -5.94039824e-02
-1.00101778e-01 -1.14667128e-01 5.57606897e-02 -6.32884559e-02
-5.33734526e-02 -5.90822523e-02 -6.17068052e-02 -5.76615359e-02
1.00000000e+00]]
My code is similar to the documentation example:
from sklearn.cluster import SpectralClustering
cl = SpectralClustering(n_clusters=4, affinity='precomputed')
y = cl.fit_predict(matrix)
But the following error occurs:
/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/utils/validation.py:629: UserWarning: Array is not symmetric, and will be converted to symmetric by average with its transpose.
warnings.warn("Array is not symmetric, and will be converted "
/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/utils/graph.py:172: RuntimeWarning: invalid value encountered in sqrt
w = np.sqrt(w)
Traceback (most recent call last):
File "/home/mahmood/PycharmProjects/sentence2vec/graphClustering.py", line 23, in <module>
y = cl.fit_predict(matrix)
File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/base.py", line 371, in fit_predict
self.fit(X)
File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/spectral.py", line 454, in fit
assign_labels=self.assign_labels)
File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/spectral.py", line 258, in spectral_clustering
eigen_tol=eigen_tol, drop_first=False)
File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/manifold/spectral_embedding_.py", line 254, in spectral_embedding
tol=eigen_tol)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 1545, in eigsh
symmetric=True, tol=tol)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 1033, in get_OPinv_matvec
return LuInv(A).matvec
File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/interface.py", line 142, in __new__
obj.__init__(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 922, in __init__
self.M_lu = lu_factor(M)
File "/usr/lib/python2.7/dist-packages/scipy/linalg/decomp_lu.py", line 58, in lu_factor
a1 = asarray_chkfinite(a)
File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1022, in asarray_chkfinite
"array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
The first warning is acceptable, since the matrix is not symmetric, but there are no infs or NaNs in the matrix.
Answer 0 (score: 2)
The NaN values appear because your matrix is not a similarity matrix: your data contains negative similarities! When the sqrt of these values is taken, you get NaN, hence the error.
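As a simplified illustration of that mechanism (not the library's actual code path), the sketch below takes the square root of a small, hypothetical array containing one negative value, copied from the question's matrix. The array `w` merely stands in for whatever values reach `w = np.sqrt(w)` in `graph.py`; the point is only that NumPy returns NaN for negative inputs instead of raising.

```python
import warnings

import numpy as np

# Hypothetical toy values; the negative entry is copied from the question's matrix.
w = np.array([1.0, 0.25, -5.97205241e-02])

with warnings.catch_warnings():
    # Silence "RuntimeWarning: invalid value encountered in sqrt" for this demo.
    warnings.simplefilter("ignore", RuntimeWarning)
    roots = np.sqrt(w)

# The negative input produced a NaN, which later fails the finiteness check.
has_nan = bool(np.isnan(roots).any())
print(roots)
print(has_nan)
```

So the input matrix can be free of infs and NaNs and still produce them in an intermediate step.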
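The warnings are not there just for fun: matrix factorization techniques have requirements that must be met for them to work and return meaningful results.
Fix your negative similarities first, then try again.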
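One possible way to do that is sketched below. It assumes that a negative similarity can be treated as "no similarity" and clipped to zero; whether that is appropriate depends on how the similarities were computed, which the question does not say. The helper `to_affinity` and the toy matrix are hypothetical and used only for illustration.

```python
import numpy as np
from sklearn.cluster import SpectralClustering


def to_affinity(similarity):
    """Hypothetical helper: return a symmetric, non-negative copy of `similarity`."""
    s = np.asarray(similarity, dtype=float)
    s = (s + s.T) / 2.0           # symmetrize explicitly (average with the transpose)
    return np.clip(s, 0.0, None)  # assumption: negative similarity means "not similar"


# Toy data: two groups of 6 items, strong similarity inside a group, weak across groups.
n = 6
toy = np.full((2 * n, 2 * n), 0.05)
toy[:n, :n] = 0.9
toy[n:, n:] = 0.9
np.fill_diagonal(toy, 1.0)
toy[0, n] = -0.06   # a negative entry, like those in the question's matrix
toy[1, 2] = 0.8     # makes the matrix slightly asymmetric

affinity = to_affinity(toy)
assert np.isfinite(affinity).all()  # sanity check before clustering

cl = SpectralClustering(n_clusters=2, affinity='precomputed', random_state=0)
labels = cl.fit_predict(affinity)
print(labels)
```

If the values are cosine similarities in [-1, 1], rescaling them with `(s + 1) / 2` instead of clipping is another option; it preserves the ordering of the similarities but also gives every pair a positive weight.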