Python Scikit-learn: ValueError: Incompatible dimension for X and Y matrices

Time: 2017-11-01 14:00:08

Tags: python scipy knn

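I'm trying to resolve an incompatible-shape error. I have 4 features, and if I understand .shape correctly, the shape of X should be (n_samples, n_features). How can I fix this?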

Here is the line where the error occurs:

plot_surface(est, X_train[:, 0], X_train[:, 1], ax=ax, threshold=0.5, contourf=True)

Here is the information for each variable:

est:

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=1, p=2,
           weights='uniform')

X_train:

[[-1. -1. -1. -1. -1. -1. -1. -1. -1. -1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]]

Here is the full stack trace:

Traceback (most recent call last):
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.3.2\helpers\pydev\pydevd.py", line 1596, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.3.2\helpers\pydev\pydevd.py", line 974, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.3.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/Thomas/Desktop/!UFV/CIS480/project/NHL-Predictor.py", line 229, in <module>
    plot_datasets(est)
  File "C:/Users/Thomas/Desktop/!UFV/CIS480/project/NHL-Predictor.py", line 195, in plot_datasets
    plot_surface(est, X_train[:, 0], X_train[:, 1], ax=ax, threshold=0.5, contourf=True)
  File "C:/Users/Thomas/Desktop/!UFV/CIS480/project/NHL-Predictor.py", line 153, in plot_surface
    pred = est.predict_proba(X_pred)[:, 1]
  File "C:\Users\Thomas\Anaconda3\lib\site-packages\sklearn\neighbors\classification.py", line 190, in predict_proba
    neigh_dist, neigh_ind = self.kneighbors(X)
  File "C:\Users\Thomas\Anaconda3\lib\site-packages\sklearn\neighbors\base.py", line 353, in kneighbors
    n_jobs=n_jobs, squared=True)
  File "C:\Users\Thomas\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py", line 1240, in pairwise_distances
    return _parallel_pairwise(X, Y, func, n_jobs, **kwds)
  File "C:\Users\Thomas\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py", line 1083, in _parallel_pairwise
    return func(X, Y, **kwds)
  File "C:\Users\Thomas\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py", line 222, in euclidean_distances
    X, Y = check_pairwise_arrays(X, Y)
  File "C:\Users\Thomas\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py", line 122, in check_pairwise_arrays
    X.shape[1], Y.shape[1]))
ValueError: Incompatible dimension for X and Y matrices: X.shape[1] == 2 while Y.shape[1] == 10

Process finished with exit code 1
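
The last line of the traceback compares column counts: the grid passed to predict_proba has 2 columns, while the data the estimator was fitted on has 10. A minimal sketch that reproduces the same kind of mismatch is shown here. The labels and the 5x5 grid are made up for illustration, and the exact wording of the message depends on the scikit-learn version.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy training data with the same shape as the first X_train dump: (2, 10).
X_train = np.array([[-1.0] * 10, [1.0] * 10])
y_train = np.array([0, 1])  # hypothetical labels, not taken from the question

est = KNeighborsClassifier(n_neighbors=1)
est.fit(X_train, y_train)  # the estimator now expects 10 features

# plot_surface builds its grid from two columns only, so it has 2 features.
xx1, xx2 = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
X_pred = np.c_[xx1.ravel(), xx2.ravel()]

print(X_train.shape)  # (2, 10)
print(X_pred.shape)   # (25, 2)

try:
    est.predict_proba(X_pred)
    raised = False
except ValueError as exc:
    # Fitted on 10 columns, asked to predict on 2 columns.
    raised = True
    print("ValueError:", exc)
```

Printing `.shape` for both the fitted data and the array passed to `predict_proba` is usually the quickest way to see which side has the unexpected width.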

~~~~~ UPDATE:~~~~~

Here is the entire function containing the statement that causes the error:

def plot_datasets(est):
    """Plots the decision surface of ``est`` on each of the three datasets."""
    fig, axes = plt.subplots(1, 3, figsize=(10, 4))
    for (name, ds), ax in zip(datasets.items(), axes):

        X_train = ds['X_train']
        y_train = ds['y_train']
        X_test = ds['X_test']
        y_test = ds['y_test']

        # plot test lighter than training
        cm_bright = ListedColormap(['#FF0000', '#0000FF'])
        # Plot the training points
        ax.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap=cm_bright)
        # and testing points
        ax.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap=cm_bright, alpha=0.6)
        # plot limits
        ax.set_xlim(X_train[:, 0].min(), X_train[:, 0].max())
        ax.set_ylim(X_train[:, 1].min(), X_train[:, 1].max())
        # no ticks
        ax.set_xticks(())
        ax.set_yticks(())
        ax.set_ylabel('$x_1$')
        ax.set_xlabel('$x_0$')
        ax.set_title(name)
        if est is not None:
            est.fit(X_train, y_train)
            #X.reshape(10, 1)

            # ERROR IS CAUSED HERE
            plot_surface(est, X_train[:, 0], X_train[:, 1], ax=ax,
                         threshold=0.5, contourf=True)
            err = (y_test != est.predict(X_test)).mean()
            ax.text(0.88, 0.02, '%.2f' % err, transform=ax.transAxes)

    # adjust the layout before showing; it has no effect after the window closes
    fig.subplots_adjust(left=.02, right=.98)
    plt.show()
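
Reading the function, est.fit sees every column of X_train, while plot_surface only receives columns 0 and 1 and builds its grid from them, which would explain the mismatch in the traceback. One possible workaround, sketched here, is to fit the estimator on just the two plotted columns. The name `cols` is illustrative, and pairing the first four posted rows with the first four posted labels is an assumption made for this sketch.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# First four rows of the posted X_train; pairing them with the first four
# posted labels is an assumption made for this sketch.
X_train = np.array([
    [0.0, 0.0, -0.70710678, -1.0],
    [-0.52223297, 0.0, 1.41421356, 1.4],
    [-1.04446594, 0.0, 1.41421356, 1.0],
    [-0.52223297, 0.0, -0.70710678, -1.4],
])
y_train = np.array([0, 1, 1, 0])

cols = [0, 1]  # hypothetical: the two features shown on the plot
X_plot = X_train[:, cols]

est = KNeighborsClassifier(n_neighbors=1)
est.fit(X_plot, y_train)  # fit on 2 features, matching the grid

# Same grid construction as plot_surface, with a coarser resolution.
xx1, xx2 = np.meshgrid(
    np.linspace(X_plot[:, 0].min(), X_plot[:, 0].max(), 10),
    np.linspace(X_plot[:, 1].min(), X_plot[:, 1].max(), 10),
)
X_pred = np.c_[xx1.ravel(), xx2.ravel()]

pred = est.predict_proba(X_pred)[:, 1]  # both sides now have 2 columns
Z = pred.reshape(xx1.shape)
print(X_pred.shape, Z.shape)
```

The trade-off is that the plotted surface then belongs to a 2-feature model, so X_test would need the same column selection before calling est.predict on it.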

Some values of the important variables in this function:

X_train:

[[ 0.          0.         -0.70710678 -1.        ]
 [-0.52223297  0.          1.41421356  1.4       ]
 [-1.04446594  0.          1.41421356  1.        ]
 [-0.52223297  0.         -0.70710678 -1.4       ]
 [ 0.          0.         -0.70710678 -0.2       ]

y_train:

[0 1 1 0 0 1]

X_test:

[[ 1.04446594  0.          1.41421356 -1.8       ]
 [-1.04446594  0.          1.41421356 -0.2       ]
 [ 0.52223297  0.          1.41421356  0.6       ]
 [ 1.04446594  0.         -0.70710678 -0.6       ]]

y_test:

[0 1 1 0]

Here is plot_surface:

def plot_surface(est, x_1, x_2, ax=None, threshold=0.0, contourf=False):
    """Plots the decision surface of ``est`` on features ``x_1`` and ``x_2``."""
    xx1, xx2 = np.meshgrid(np.linspace(x_1.min(), x_1.max(), 100),
                           np.linspace(x_2.min(), x_2.max(), 100))
    # plot the hyperplane by evaluating the parameters on the grid
    X_pred = np.c_[xx1.ravel(), xx2.ravel()]  # convert 2d grid into seq of points
    if hasattr(est, 'predict_proba'):  # check if ``est`` supports probabilities
        # take probability of positive class
        pred = est.predict_proba(X_pred)[:, 1]
    else:
        pred = est.predict(X_pred)
    Z = pred.reshape((100, 100))  # reshape seq to grid
    if ax is None:
        ax = plt.gca()
    # plot line via contour plot

    if contourf:
        ax.contourf(xx1, xx2, Z, levels=np.linspace(0, 1.0, 10), cmap=plt.cm.RdBu, alpha=0.6)
    ax.contour(xx1, xx2, Z, levels=[threshold], colors='black')
    ax.set_xlim((x_1.min(), x_1.max()))
    ax.set_ylim((x_2.min(), x_2.max()))
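
X_pred in plot_surface always has exactly two columns, so it can only be passed to an estimator fitted on two features. If the estimator should stay fitted on all features, one option is to widen the grid before predicting, holding the unplotted features at a fixed value. The sketch below uses the training mean for that. `pad_grid` is a hypothetical helper, not part of scikit-learn, and the random data is made up for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def pad_grid(X_pred_2d, X_train, cols=(0, 1)):
    """Hypothetical helper: widen 2-feature grid points to X_train's width."""
    # Start every grid point at the per-feature training mean ...
    X_full = np.tile(X_train.mean(axis=0), (X_pred_2d.shape[0], 1))
    # ... then overwrite the two plotted features with the grid coordinates.
    X_full[:, list(cols)] = X_pred_2d
    return X_full


# Made-up 4-feature data, only to exercise the helper.
rng = np.random.RandomState(0)
X_train = rng.randn(6, 4)
y_train = np.array([0, 1, 1, 0, 0, 1])

est = KNeighborsClassifier(n_neighbors=1)
est.fit(X_train, y_train)  # fitted on all 4 features

xx1, xx2 = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
X_pred = np.c_[xx1.ravel(), xx2.ravel()]  # shape (25, 2)

X_full = pad_grid(X_pred, X_train)        # shape (25, 4)
pred = est.predict_proba(X_full)[:, 1]    # widths match, so no ValueError
print(X_full.shape, pred.shape)
```

The resulting surface is a 2D slice through the 4-feature decision function at the chosen fill values, so it will not necessarily agree with where the scattered points fall.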

0 answers:

No answers yet