Multiple linear regression prediction with sklearn in Python

Time: 2015-04-23 12:32:08

Tags: python matplotlib

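I want to do multiple linear regression with sklearn. I have 3 features (year, month, day) and I want to predict kwh. data1 is a DataFrame. What is wrong with the code below?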

dt = data1.drop(['kwh'], axis=1)
dt
                       year     month   day
date            
2012-04-12 14:56:50     2012    4   12
2012-04-12 15:11:55     2012    4   12
2012-04-12 15:27:01     2012    4   12
2012-04-12 15:42:06     2012    4   12
2012-04-12 15:57:10     2012    4   12
2012-04-12 16:12:10     2012    4   12
2012-04-12 16:27:14     2012    4   12
2012-04-12 16:42:19     2012    4   12
2012-04-12 16:57:24     2012    4   12
2012-04-12 17:12:28     2012    4   12
2012-04-12 17:27:33     2012    4   12
2012-04-12 17:42:37     2012    4   12
2012-04-12 17:57:41     2012    4   12
2012-04-12 18:12:44     2012    4   12
2012-04-12 18:27:46     2012    4   12
2012-04-12 18:42:51     2012    4   12
2012-04-12 18:57:54     2012    4   12
2012-04-12 19:12:58     2012    4   12
2012-04-12 19:28:01     2012    4   12
2012-04-12 19:43:04     2012    4   12
2012-04-12 19:58:07     2012    4   12
2012-04-12 20:13:10     2012    4   12
2012-04-12 20:28:15     2012    4   12
2012-04-12 20:43:15     2012    4   12
2012-04-12 20:58:18     2012    4   12
2012-04-12 21:13:20     2012    4   12
2012-04-12 21:28:22     2012    4   12
2012-04-12 21:43:24     2012    4   12
2012-04-12 21:58:27     2012    4   12
2012-04-12 22:13:29     2012    4   12
2012-04-12 22:28:34     2012    4   12
2012-04-12 22:43:38     2012    4   12
2012-04-12 22:58:43     2012    4   12
2012-04-12 23:13:43     2012    4   12
2012-04-12 23:28:46     2012    4   12
2012-04-12 23:43:55     2012    4   12
2012-04-12 23:59:00     2012    4   12
2012-04-13 00:14:02     2012    4   13
2012-04-13 00:29:05     2012    4   13
2012-04-13 00:44:09     2012    4   13
2012-04-13 00:59:09     2012    4   13
2012-04-13 01:14:10     2012    4   13
2012-04-13 01:29:11     2012    4   13
2012-04-13 01:44:16     2012    4   13
2012-04-13 01:59:22     2012    4   13
2012-04-13 02:14:21     2012    4   13
2012-04-13 02:29:24     2012    4   13
2012-04-13 02:44:24     2012    4   13
2012-04-13 02:59:25     2012    4   13
2012-04-13 03:14:30     2012    4   13
2012-04-13 03:29:31     2012    4   13
2012-04-13 03:44:31     2012    4   13
2012-04-13 03:59:42     2012    4   13
2012-04-13 04:14:43     2012    4   13
2012-04-13 04:29:43     2012    4   13
2012-04-13 04:44:46     2012    4   13
2012-04-13 04:59:47     2012    4   13
2012-04-13 05:14:48     2012    4   13
2012-04-13 05:29:49     2012    4   13
2012-04-13 05:44:50     2012    4   13
   ...  ...     ...

65701 rows × 3 columns

x_train, x_test, y_train, y_test = train_test_split(dt, data1['kwh'], test_size=0.4)
clf = LinearRegression()
clf.fit(x_train, y_train)
plt.scatter(x_test, y_test)
plt.plot(x_test, clf.predict(x_test), color='blue',
     linewidth=3)
plt.show()

This is the error:

ValueError                                Traceback (most recent call last)
<ipython-input-97-a4b702fcee3d> in <module>()
----> 1 plt.scatter(x_test, y_test)
      2 plt.plot(x_test, clf.predict(x_test), color='blue',
      3          linewidth=3)
      4 plt.show()

/usr/lib/pymodules/python2.7/matplotlib/pyplot.pyc in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, hold, **kwargs)
   3085         ret = ax.scatter(x, y, s=s, c=c, marker=marker, cmap=cmap, norm=norm,
   3086                          vmin=vmin, vmax=vmax, alpha=alpha,
-> 3087                          linewidths=linewidths, verts=verts, **kwargs)
   3088         draw_if_interactive()
   3089     finally:

/usr/lib/pymodules/python2.7/matplotlib/axes.pyc in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, **kwargs)
   6254         y = np.ma.ravel(y)
   6255         if x.size != y.size:
-> 6256             raise ValueError("x and y must be the same size")
   6257 
   6258         s = np.ma.ravel(s)  # This doesn't have to match x, y in size.

ValueError: x and y must be the same size
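
A note on the traceback: scatter flattens both arguments (the np.ma.ravel calls) and then compares x.size with y.size. Since x_test has three columns (year, month, day) and y_test has one value per row, x_test holds three times as many values as y_test, which would explain the error. The sketch below reproduces that mismatch on synthetic data and shows one possible way to plot a multi-feature fit, namely predicted against actual values. It is an illustration under assumptions, not the asker's setup: the data is made up, plot_predicted_vs_actual is a hypothetical helper, and train_test_split is imported from sklearn.model_selection as in recent scikit-learn releases.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for data1 (made-up values, not the asker's data).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    np.full(n, 2012),              # year
    rng.integers(1, 13, size=n),   # month
    rng.integers(1, 29, size=n),   # day
])
# Hypothetical kwh target with a little noise.
y = 0.5 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(scale=0.1, size=n)

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)
clf = LinearRegression().fit(x_train, y_train)
y_pred = clf.predict(x_test)

# The mismatch behind the ValueError: three values per row vs. one.
print("x_test.size:", x_test.size, "y_test.size:", y_test.size)


def plot_predicted_vs_actual(y_true, y_hat):
    """Scatter predictions against true values (both have one value per row)."""
    import matplotlib.pyplot as plt  # imported here so the fit runs without it

    plt.scatter(y_true, y_hat)
    lo, hi = min(y_true.min(), y_hat.min()), max(y_true.max(), y_hat.max())
    plt.plot([lo, hi], [lo, hi], color="blue", linewidth=3)  # perfect-fit line
    plt.xlabel("actual kwh")
    plt.ylabel("predicted kwh")
    plt.show()
```

Calling plot_predicted_vs_actual(y_test, y_pred) draws the comparison. Another option is to scatter a single feature column against the target, for example x_test[:, 2] (day) against y_test, since each of those also has one value per row.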

0 answers:

No answers