Question

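I'm new to Python and to programming, and I'm trying to get my feet wet by running a linear regression on a single variable.

I'm currently following this tutorial: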

https://www.youtube.com/watch?v=8jazNUpO3lQ&list=PLeo1K3hjS3uvCeTYTeyfe0-rN5r8zn9rw&index=2

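I'm doing exactly what he does. However, as the code below shows, I get an error when I run it.

(For simplicity, I've prefixed each output with "--". I'm using Jupyter Notebook.)

At the very end, when I try to run "reg.predict(3300)", I get a long error. I don't understand what's wrong. Can anyone help me?

Cheers!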

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn import linear_model

    df = pd.read_csv("homeprices.csv")
    df

    --area  price
    0   2600    550000
    1   3000    565000
    2   3200    610000
    3   3600    680000
    4   4000    725000

    %matplotlib inline
    plt.xlabel('area(sqr ft)')
    plt.ylabel('price(US$)')
    plt.scatter(df.area, df.price, color='red', marker = '+')

    --<matplotlib.collections.PathCollection at 0x2e823ce66a0>

    reg = linear_model.LinearRegression()
    reg.fit(df[['area']],df.price)

    --LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
             normalize=False)

    reg.predict(3300)

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-16-ad5a8409ff75> in <module>
    ----> 1 reg.predict(3300)

    ~\Anaconda3\lib\site-packages\sklearn\linear_model\base.py in predict(self, X)
        211             Returns predicted values.
        212         """
    --> 213         return self._decision_function(X)
        214 
        215     _preprocess_data = staticmethod(_preprocess_data)

    ~\Anaconda3\lib\site-packages\sklearn\linear_model\base.py in _decision_function(self, X)
        194         check_is_fitted(self, "coef_")
        195 
    --> 196         X = check_array(X, accept_sparse=['csr', 'csc', 'coo'])
        197         return safe_sparse_dot(X, self.coef_.T,
        198                                dense_output=True) + self.intercept_

    ~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
        543                     "Reshape your data either using array.reshape(-1, 1) if "
        544                     "your data has a single feature or array.reshape(1, -1) "
    --> 545                     "if it contains a single sample.".format(array))
        546             # If input is 1D raise error
        547             if array.ndim == 1:

    ValueError: Expected 2D array, got scalar array instead:
    array=3300.
    Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Answer 1

Try reg.predict([[3300]]). The API used to accept scalar values, but now you need to pass a 2D array.
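For what it's worth, here is a minimal sketch of that fix. It assumes the five rows shown in the question instead of reading homeprices.csv, and it fits on plain NumPy arrays (my choice, not the tutorial's) so that passing a nested list to predict matches what the model was trained on. The variable names pred_list and pred_reshaped are just for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Same five rows as in the question, built inline instead of read from the CSV.
df = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000],
    "price": [550000, 565000, 610000, 680000, 725000],
})

# X must be 2D: one row per sample, one column per feature.
X = df[["area"]].to_numpy()   # shape (5, 1)
y = df["price"].to_numpy()    # shape (5,)

reg = LinearRegression()
reg.fit(X, y)

# One sample with one feature, written as a nested list: [[3300]].
pred_list = reg.predict([[3300]])

# Equivalent: start from a 1D array and reshape it into a single column.
pred_reshaped = reg.predict(np.array([3300]).reshape(-1, 1))

print(pred_list, pred_reshaped)
```

Both calls return a 1D array with one prediction, because predict always returns one value per input row.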

Answer 2

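    reg.fit(df[['area']],df.price)

In the call above, fit receives two arguments: the features X as a 2D structure (df[['area']], a DataFrame with one column) and the target y (df.price). predict expects X in the same 2D layout, so we need a 2D array there as well. Therefore:

    reg.predict([[3300]])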
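To make the shape argument concrete, here is a small sketch, again assuming the five rows from the question rather than the CSV file. It prints the shapes involved and then predicts from a one-column DataFrame with the same column name used in fit. The names new_homes and predictions, and the second area value of 5000, are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Same five rows as in the question, built inline instead of read from the CSV.
df = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000],
    "price": [550000, 565000, 610000, 680000, 725000],
})

reg = LinearRegression().fit(df[["area"]], df["price"])

# Double brackets give a 2D DataFrame: 5 samples by 1 feature.
x_shape = df[["area"]].shape      # (5, 1)
# The target is a 1D Series: one value per sample.
y_shape = df["price"].shape       # (5,)
# A bare number has zero dimensions, which is what the ValueError complains about.
scalar_ndim = np.asarray(3300).ndim   # 0

# Predict for several areas at once, using the same column name as in fit.
new_homes = pd.DataFrame({"area": [3300, 5000]})
predictions = reg.predict(new_homes)

print(x_shape, y_shape, scalar_ndim)
print(predictions)
```

Passing a DataFrame here keeps the input in the same form the model was fitted on, and it scales naturally to predicting many rows in one call.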

Don't understand error message (basic sklearn command)
