Broadcast error: Optimal Transport library

Time: 2017-12-05 15:47:19

Tags: python numpy machine-learning

I am doing barycenter clustering. Why do I get this error? Something seems to go wrong during a matrix multiplication.

import numpy as np
import pandas as pd
import ot

def initialize_clusters(points, k):
    return points[np.random.randint(points.shape[0], size=k)]

def get_distances(centroid, points):
    return np.linalg.norm(points - centroid, axis=1)

if __name__ == "__main__":
    X = pd.read_csv('./csv/inst_clust2.csv',encoding='utf-8')[['lat','lng']]
    M = ot.dist(X,metric='euclidean')
    X = X.as_matrix()
    k = 3
    maxiter = 50
    centroids = initialize_clusters(X,k)
    classes = np.zeros(X.shape[0],dtype=np.float64)
    distances = np.zeros([X.shape[0],k],dtype=np.float64)
    for i in range(maxiter):
        for i, c in enumerate(centroids):
            distances[:,i] = get_distances(c, X)
        classes = np.argmin(distances,axis=1)
        for c in range(k):
            print(X)
            print(X[classes==c])
            centroids[c]=ot.barycenter(X[classes==c],M,1e-3)

This gives the error:

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ot/bregman.py:801: RuntimeWarning: invalid value encountered in log
  return np.exp(np.mean(np.log(alldistribT), axis=1))
Traceback (most recent call last):
  File "/Users/Chu/Documents/dssg2018/bc.py", line 25, in <module>
    centroids[c]=ot.bregman.barycenter(X[classes==c],M,1e-3)
ValueError: could not broadcast input array from shape (605) into shape (2)

1 answer:

Answer 0 (score: 0)

This message appears when you try to assign a value to an array slot whose shape does not match it.

>>> f = np.zeros((2,2))
>>> f[0] = np.zeros((605,))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (605) into shape (2)

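From my reading of the documentation for ot.barycenter, it returns an array of shape (d,), where d is the first dimension of its first argument. So X[classes==c] must have 605 rows here, and a result of length 605 cannot be assigned to centroids[c], which has length 2. Have you tried ot.barycenter(X[classes==c][0],M,1e-3)?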
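The shape mismatch can be checked with NumPy alone, without calling the library. The following is a minimal sketch: the data, the cluster labels, and the helper can_assign are made up for illustration, and the zero array only stands in for a return value of shape (d,). The column mean is shown purely as an example of a value with shape (2,); it is the ordinary k-means centroid, not a Wasserstein barycenter.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((10, 2))                      # 10 made-up points, columns like (lat, lng)
classes = np.array([0, 1, 0, 2, 1, 0, 2, 1, 0, 0])
centroids = np.zeros((3, 2))                 # each centroid row has shape (2,)

cluster = X[classes == 0]                    # rows of X assigned to cluster 0

# Stand-in for a result of shape (d,), with d = number of rows of the first argument.
result = np.zeros(cluster.shape[0])

def can_assign(target_row, value):
    """Hypothetical helper: True if `value` broadcasts to the shape of `target_row`."""
    try:
        np.broadcast_to(value, target_row.shape)
        return True
    except ValueError:
        return False

fits_barycenter_shape = can_assign(centroids[0], result)           # length-5 value into a length-2 row
fits_mean_shape = can_assign(centroids[0], cluster.mean(axis=0))   # length-2 value into a length-2 row

print(cluster.shape, result.shape, fits_barycenter_shape, fits_mean_shape)
```

Printing the shapes of both sides before the assignment is usually the quickest way to locate this kind of error.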

Here is the relevant documentation:

barycenter(A, M, reg, weights=None, numItermax=1000, stopThr=0.0001, verbose=False, log=False)
    Compute the entropic regularized wasserstein barycenter of distributions A

    ... snip ...

    Parameters
    ----------
    A : np.ndarray (d,n)
        n training distributions of size d

    ... snip ...

    Returns
    -------
    a : (d,) ndarray
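To make the (d,n) layout concrete, here is a small NumPy-only sketch of inputs in that shape. It rests on several assumptions that the excerpt above does not state: that each column of A is a nonnegative histogram summing to 1, that M is a (d,d) cost matrix between the d bins, and that the bins sit at positions 0..d-1 on a line. The sizes are arbitrary and the library itself is not called.

```python
import numpy as np

d, n = 4, 3                                   # hypothetical sizes: 4 bins, 3 distributions
rng = np.random.default_rng(0)

raw = rng.random((d, n))
A = raw / raw.sum(axis=0, keepdims=True)      # each column becomes a histogram summing to 1

# Assumed cost matrix: absolute distance between bin positions 0..d-1.
positions = np.arange(d, dtype=float)
M = np.abs(positions[:, None] - positions[None, :])

# Per the excerpt, the returned barycenter has shape (d,), one weight per bin.
expected_result_shape = (A.shape[0],)

print(A.shape, M.shape, expected_result_shape)
```

Read this way, passing a cluster of 605 coordinate pairs as A would be interpreted as 2 distributions over 605 bins, which is consistent with the length-605 result in the traceback.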