I am working with a CSV file. I wrote the linear regression code below:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
dataset=pd.read_csv(r"C:\Users\asrivastava\Desktop\Python tutorials\Udemy data set machine leanring\Machine Learning A-Z Template Folder\Part 2 - Regression\Section 4 - Simple Linear Regression\Simple_Linear_Regression\Salary_Data.csv")
X=dataset.iloc[:,:-1].values
y=dataset.iloc[:,1].values
from sklearn.model_selection import train_test_split
print dataset
print X
print y
X_train,X_test,y_train,y_test= train_test_split(X,y,test_size=1/3,random_state=0)
from sklearn.linear_model import LinearRegression
regressor=LinearRegression()
regressor.fit(X_train,y_train)
#predicting the test set results
y_pred= regressor.predict(X_test)
(plt.scatter(X_train,y_train,color="red"))
(plt.plot(X_train,regressor.predict(X_train),color="blue"))
(plt.title("Salary vs Experince (Training set)"))
(plt.xlabel("Years of Experience"))
(plt.ylabel("Salary"))
plt.show()
This is the dataset:
YearsExperience Salary
0 1.1 39343.0
1 1.3 46205.0
2 1.5 37731.0
3 2.0 43525.0
4 2.2 39891.0
5 2.9 56642.0
6 3.0 60150.0
7 3.2 54445.0
8 3.2 64445.0
9 3.7 57189.0
10 3.9 63218.0
11 4.0 55794.0
12 4.0 56957.0
13 4.1 57081.0
14 4.5 61111.0
15 4.9 67938.0
16 5.1 66029.0
17 5.3 83088.0
18 5.9 81363.0
19 6.0 93940.0
20 6.8 91738.0
21 7.1 98273.0
22 7.9 101302.0
23 8.2 113812.0
24 8.7 109431.0
25 9.0 105582.0
26 9.5 116969.0
27 9.6 112635.0
28 10.3 122391.0
29 10.5 121872.0
Answer 0 (score: 0)
The ValueError probably comes from this line:
(plt.scatter(X_train,y_train,color="red"))
We get this error when the shapes of X_train and y_train do not match. Check X_train.shape and y_train.shape and make sure both hold the same number of samples.
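As a rough illustration of that check, here is a minimal sketch using NumPy only. It rebuilds X and y from the first five rows of the dataset shown in the question, in the same layout that dataset.iloc[:,:-1].values and dataset.iloc[:,1].values would produce. The helper name shapes_compatible is hypothetical and not part of any library.

```python
import numpy as np

# First five rows of the Salary_Data set shown in the question.
# X mimics dataset.iloc[:, :-1].values (2-D, one column);
# y mimics dataset.iloc[:, 1].values (1-D).
X = np.array([[1.1], [1.3], [1.5], [2.0], [2.2]])
y = np.array([39343.0, 46205.0, 37731.0, 43525.0, 39891.0])


def shapes_compatible(X, y):
    """Hypothetical helper: True when X and y hold the same number of samples."""
    return X.shape[0] == y.shape[0]


print(X.shape, y.shape)          # (5, 1) (5,)
print(shapes_compatible(X, y))   # True

# Flattening the single feature column gives a 1-D array with the same
# length as y, which is the simplest input for a scatter plot.
x_flat = X[:, 0]
print(x_flat.shape)              # (5,)
```

If the two sample counts differ, the mismatch was most likely introduced before the plotting call, for example when slicing the columns or splitting the data.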
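We can still get the code to work by plotting one specific column of X_train against y_train. Note that X_train is a NumPy array here (it was built with .values), so it is indexed directly rather than through .iloc, and its only column has index 0. For example:
(plt.scatter(X_train[:,0],y_train,color="red"))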
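One more thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the question uses print statements, which suggests Python 2. There, 1/3 between two integers is integer division and evaluates to 0, so test_size=1/3 would request an empty test set. The sketch below runs on Python 3 and uses // to mimic the Python 2 behaviour.

```python
# Python 3 sketch. The `//` operator reproduces what Python 2 computes
# for `1/3` when both operands are integers.
py2_style = 1 // 3      # integer division
float_style = 1.0 / 3   # true division, works the same in Python 2 and 3

print(py2_style)        # 0
print(float_style > 0)  # True
```

If that is the case, writing test_size=1.0/3 (or adding from __future__ import division at the top of the script) would pass the intended fraction.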