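I am new to data science, and I have downloaded code that predicts the number of viewers for next week.
In the code below, I can't understand what the following calls do or how they predict a value.
Each dataset has 7 values. Why is only 9 passed inside the parentheses?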
regr1 = linear_model.LinearRegression()
regr1.fit(x1, y1)
predicted_value1 = regr1.predict(9)
What do these lines do?
Here is the full code:
import pandas as pd
from sklearn import linear_model

# Read the CSV and split it into x (episode number) and y (viewers) lists per show
def get_data(file_name):
    data = pd.read_csv(file_name)
    flash_x_parameter = []
    flash_y_parameter = []
    arrow_x_parameter = []
    arrow_y_parameter = []
    for x1, y1, x2, y2 in zip(data['flash_episode_number'],
                              data['flash_us_viewers'],
                              data['arrow_episode_number'],
                              data['arrow_us_viewers']):
        # Each x is wrapped in a list because fit() expects one row per sample
        flash_x_parameter.append([float(x1)])
        flash_y_parameter.append(float(y1))
        arrow_x_parameter.append([float(x2)])
        arrow_y_parameter.append(float(y2))
    return flash_x_parameter, flash_y_parameter, arrow_x_parameter, arrow_y_parameter

# Fit one linear regression per show and compare the two predictions
def more_viewers(x1, y1, x2, y2):
    regr1 = linear_model.LinearRegression()
    regr1.fit(x1, y1)
    # 9 is the episode number to predict for
    # (newer scikit-learn versions expect 2D input here, e.g. [[9]])
    predicted_value1 = regr1.predict(9)
    regr2 = linear_model.LinearRegression()
    regr2.fit(x2, y2)
    predicted_value2 = regr2.predict(9)
    print predicted_value1, "are the flash viewers"
    print predicted_value2, "are the arrow viewers"
    if predicted_value1 > predicted_value2:
        print "The Flash Tv Show will have more viewers for next week"
    else:
        print "Arrow Tv Show will have more viewers for next week"

x1, y1, x2, y2 = get_data('C:\\Users\\SHIVAPRASAD\\Desktop\\test.csv')
more_viewers(x1, y1, x2, y2)
Answer 0: (score: 0)
No, your data is not a set of 7 values; it has 9 rows:
+----------------+-------------------+----------------+------------------+
| FLASH_EPISODE | FLASH_US_VIEWERS | ARROW_EPISODE | ARROW_US_VIEWERS |
+----------------+-------------------+----------------+------------------+
| 1 | 4.83 | 1 | 2.84 |
| 2 | 4.27 | 2 | 2.32 |
| 3 | 3.59 | 3 | 2.55 |
| 4 | 3.53 | 4 | 2.49 |
| 5 | 3.46 | 5 | 2.73 |
| 6 | 3.73 | 6 | 2.6 |
| 7 | 3.47 | 7 | 2.64 |
| 8 | 4.34 | 8 | 3.92 |
| 9 | 4.66 | 9 | 3.06 |
+----------------+-------------------+----------------+------------------+
(This is the data your code uses, since the code comes from the Dataconomy article Linear Regression Implementation in Python.)
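If you want to check the row count yourself, a minimal sketch along these lines may help. It assumes the table above is stored as CSV with the column names that get_data() reads in the question. The inline CSV_TEXT is only a stand-in for your test.csv, and the names rows, row_count and last_episode are illustrative.

```python
import csv
import io

# Inline copy of the table above; the header names follow the question's get_data().
CSV_TEXT = """flash_episode_number,flash_us_viewers,arrow_episode_number,arrow_us_viewers
1,4.83,1,2.84
2,4.27,2,2.32
3,3.59,3,2.55
4,3.53,4,2.49
5,3.46,5,2.73
6,3.73,6,2.6
7,3.47,7,2.64
8,4.34,8,3.92
9,4.66,9,3.06
"""

# DictReader consumes the header line, so every remaining line is one data row.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
row_count = len(rows)
last_episode = int(rows[-1]["flash_episode_number"])

print(row_count, last_episode)
```

With a real file, replacing io.StringIO(CSV_TEXT) with an opened file handle should give the same kind of check.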
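The value 9 in the command
predicted_value1 = regr1.predict(9)
is fine: it is the episode number for which the model predicts the viewers, not a count of the values in your data.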
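To see roughly what fit and predict do with a single feature, here is a plain-Python sketch of an ordinary least-squares line, which is the kind of model LinearRegression fits. The helpers fit_line and predict are hypothetical names written for this illustration, not scikit-learn functions, and the data is the Flash column from the table above.

```python
# Flash data from the table above: episode number and US viewers.
episodes = [1, 2, 3, 4, 5, 6, 7, 8, 9]
viewers = [4.83, 4.27, 3.59, 3.53, 3.46, 3.73, 3.47, 4.34, 4.66]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x (here: an episode number)."""
    return intercept + slope * x


# "fit" learns the line from all nine rows; "predict" evaluates it at episode 9.
slope, intercept = fit_line(episodes, viewers)
flash_prediction = predict(slope, intercept, 9)
print(flash_prediction)
```

The argument to predict is an x value (an episode number), so it has nothing to do with how many rows were used for fitting.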