matplotlib error: x and y must be the same size

Time: 2019-03-29 10:55:19

Tags: python pandas matplotlib scikit-learn

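How can I fix the "ValueError: x and y must be the same size" error in my Python code?

The idea of the code is to apply a multiple linear regression model to temperature and NO data from different sensors, then train the model and look at the relevant results and the overall prediction.

I am not sure whether the code works well, because I am still learning and do not know much about this. If anyone has suggestions for improving the code, please share them as well.

Thank you very much.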

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import pandas as pd
import matplotlib.pyplot as plt

# Names of the files
filename = 'NORM_AC_HAE.csv'
file = 'NORM_NABEL_HAE_lev1.csv'

# Read the data
data = pd.read_csv(filename)
data_other = pd.read_csv(file)

# Two feature columns (sensor NO signal and temperature) and one target column
col = ['Aircube.009.0.no.we.aux.ch6', 'Aircube.009.0.sht.temperature.ch1']
X = data.loc[:, col]
Y = data_other.loc[:, 'NO.ppb']

# Fitting the linear regression to the training set
# random_state takes an integer seed; np.random.seed(0) returns None and np was never imported
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, train_size=0.6, random_state=0)
mlr = LinearRegression()
mlr.fit(X_train, y_train)

# Visualization of the test set results
plt.figure(2)
plt.scatter(y_test, X_test)  # The ValueError appears here

The error is:

Traceback (most recent call last):
  File "C:\Users\andre\Desktop\UV\4o\TFG\EMPA\dataset_Mila\MLR_no_temp_hae_no.py", line 65, in <module>
    plt.scatter(y_test, X_test)
  File "C:\Users\andre\AppData\Local\Programs\Python\Python37-32\lib\site-packages\matplotlib\pyplot.py", line 2864, in scatter
    is not None else {}), **kwargs)
  File "C:\Users\andre\AppData\Local\Programs\Python\Python37-32\lib\site-packages\matplotlib\__init__.py", line 1810, in inner
    return func(ax, *args, **kwargs)
  File "C:\Users\andre\AppData\Local\Programs\Python\Python37-32\lib\site-packages\matplotlib\axes\_axes.py", line 4182, in scatter
    raise ValueError("x and y must be the same size")
ValueError: x and y must be the same size
[Finished in 6.9s]

1 answer:

Answer 0 (score: 0)

  

X_test.shape = [36648 rows x 2 columns]

Both data arguments of plt.scatter (here y_test and X_test) must be 1-dimensional arrays; from the docs:

  

x, y : array_like, shape (n, )

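Here you are passing X_test, which is a 2-dimensional matrix with two columns. It therefore holds twice as many values as y_test, which is why you get the size mismatch error.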
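The mismatch can be reproduced on a few rows of made-up data. This is only a minimal sketch: the values are random, the column names `no` and `temperature` are placeholders rather than the real sensor columns, and the Agg backend is selected just so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-ins for X_test (two feature columns) and y_test (one target)
X_test = pd.DataFrame({"no": rng.normal(size=5),
                       "temperature": rng.normal(size=5)})
y_test = pd.Series(rng.normal(size=5), name="NO.ppb")

print(X_test.shape)  # (5, 2)
print(y_test.shape)  # (5,)

# Passing the whole two-column frame gives scatter 10 values against 5
try:
    plt.scatter(y_test, X_test)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# A single column has the same number of values as y_test, so this works
first_column = X_test.iloc[:, 0].values
plt.scatter(y_test, first_column)
```

Checking `.shape` on both arguments before plotting is usually the quickest way to spot this kind of problem.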

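You cannot make a scatter plot of a matrix against an array/vector. What you can do is produce two separate scatter plots, one for each column of X_test: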

plt.figure(2)
plt.scatter(y_test, X_test.iloc[:,0].values)

plt.figure(3)
plt.scatter(y_test, X_test.iloc[:,1].values)
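Since the question also asks how to look at the prediction, another common option (not part of the answer above) is to plot the measured test values against the model's predictions, which are both 1-dimensional. The sketch below assumes synthetic data in place of the two CSV files; the column names `no_raw` and `temperature` and the coefficients used to generate the target are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200

# Synthetic features and target (assumed, not the real sensor data)
X = pd.DataFrame({"no_raw": rng.normal(size=n),
                  "temperature": rng.normal(20, 5, size=n)})
y = 3.0 * X["no_raw"] - 0.5 * X["temperature"] + rng.normal(scale=0.1, size=n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

mlr = LinearRegression().fit(X_train, y_train)
y_pred = mlr.predict(X_test)  # 1-D array with one prediction per test row

# Measured vs. predicted: both arguments have shape (n_test,)
fig, ax = plt.subplots()
ax.scatter(y_test, y_pred)
ax.set_xlabel("measured NO")
ax.set_ylabel("predicted NO")

r2 = mlr.score(X_test, y_test)  # coefficient of determination on the test set
print(r2)
```

Points close to the diagonal indicate that the model predicts well on the test set.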