ValueError: x and y must have same first dimension, but have shapes (4200,) and (16800, 1)

Time: 2018-06-26 07:56:36

Tags: python scikit-learn svm

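I have built an SVR model with scikit-learn and am trying to plot the results, but for some reason I get this error:

ValueError: x and y must have same first dimension, but have shapes (4200,) and (16800, 1)

I have split the data into training and test sets, trained the model, and made predictions. My code is: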

import numpy
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn import preprocessing as pre
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

X_feature = wind_speed
X_feature = X_feature.reshape(-1, 1)  ## Reshaping array from 1D to a 2D column

y_label = Power
y_label = y_label.reshape(-1, 1)

timeseries_split = TimeSeriesSplit(n_splits=3)  ## Splitting the data into 3 train/test splits
for train_index, test_index in timeseries_split.split(X_feature):  ## Loop over the splits and print their indices
    print("Training data:", train_index, "Testing data test:", test_index)
    X_train, X_test = X_feature[train_index], X_feature[test_index]
    y_train, y_test = y_label[train_index], y_label[test_index]

    scaler = pre.MinMaxScaler(feature_range=(0, 1)).fit(X_train)  ## Min-max scaler fitted on the training wind speed

    scaled_wind_speed_train = scaler.transform(X_train)  ## Wind speed training data scaled to [0, 1]
    scaled_wind_speed_test = scaler.transform(X_test)  ## Wind speed test data scaled with the same scaler

    SVR_model = svm.SVR(kernel='rbf', C=100, gamma=.001).fit(scaled_wind_speed_train, y_train)

    y_prediction = SVR_model.predict(scaled_wind_speed_test)

    SVR_model.score(scaled_wind_speed_test, y_test)

    rmse = numpy.sqrt(mean_squared_error(y_label, y_prediction))
    print("RMSE:", rmse)

    fig, bx = plt.subplots(figsize=(19, 8))
    bx.plot(y_prediction, X_feature, 'bs')
    fig.suptitle('Wind Power Prediction v Wind Speed', fontsize=20)
    plt.xlabel('Wind Power Data')
    plt.ylabel('Predicted Power')
    plt.xticks(rotation=30)
    plt.show()

    fig, bx = plt.subplots(figsize=(19, 8))
    bx.plot(y_prediction, y_label)
    fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
    plt.xlabel('Wind Power Data')
    plt.ylabel('Predicted Power')

    fig, bx = plt.subplots(figsize=(19, 8))
    bx.plot(y_prediction)
    fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
    plt.xlabel('Wind Power Data')
    plt.ylabel('Predicted Power')

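I think the error is raised on the line where I try to compute the RMSE:

rmse=numpy.sqrt(mean_squared_error(y_label,y_prediction))

However, the error also occurs when I comment that line out and try to plot the data.

My traceback is: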


ValueError                                Traceback (most recent call last)
<ipython-input-57-ed11a9ca7fd8> in <module>()
     79 
     80     fig, bx = plt.subplots(figsize=(19,8))
---> 81     bx.plot( y_prediction, y_label)
     82     fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
     83     plt.xlabel('Wind Power Data')

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
   1715                     warnings.warn(msg % (label_namer, func.__name__),
   1716                                   RuntimeWarning, stacklevel=2)
-> 1717             return func(ax, *args, **kwargs)
   1718         pre_doc = inner.__doc__
   1719         if pre_doc is None:

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_axes.py in plot(self, *args, **kwargs)
   1370         kwargs = cbook.normalize_kwargs(kwargs, _alias_map)
   1371 
-> 1372         for line in self._get_lines(*args, **kwargs):
   1373             self.add_line(line)
   1374             lines.append(line)

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_base.py in _grab_next_args(self, *args, **kwargs)
    402                 this += args[0],
    403                 args = args[1:]
--> 404             for seg in self._plot_args(this, kwargs):
    405                 yield seg
    406 

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_base.py in _plot_args(self, tup, kwargs)
    382             x, y = index_of(tup[-1])
    383 
--> 384         x, y = self._xy_from_xy(x, y)
    385 
    386         if self.command == 'plot':

~/anaconda3_501/lib/python3.6/site-packages/matplotlib/axes/_base.py in _xy_from_xy(self, x, y)
    241         if x.shape[0] != y.shape[0]:
    242             raise ValueError("x and y must have same first dimension, but "
--> 243                              "have shapes {} and {}".format(x.shape, y.shape))
    244         if x.ndim > 2 or y.ndim > 2:
    245             raise ValueError("x and y can be no greater than 2-D, but have "

ValueError: x and y must have same first dimension, but have shapes (4200,) and (16800, 1)

2 Answers:

Answer 0 (score: 2)

I think the arguments you pass to mean_squared_error do not match each other. It should be

rmse=numpy.sqrt(mean_squared_error(y_test,y_prediction))
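The reason y_test fits where y_label does not can be checked with a small sketch. It assumes 16800 samples, as the error message suggests, and uses synthetic stand-ins for the wind speed and power data; the noisy "prediction" is only a placeholder for what predict() would return. With n_splits=3, each test fold should hold 4200 samples, which would explain the (4200,) shape.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

# Synthetic stand-ins for the question's data (assumption: 16800 samples).
rng = np.random.default_rng(0)
X_feature = rng.uniform(0, 25, size=16800).reshape(-1, 1)
y_label = (X_feature ** 3).reshape(-1, 1)

test_sizes = []
for train_index, test_index in TimeSeriesSplit(n_splits=3).split(X_feature):
    y_test = y_label[test_index]
    # Placeholder prediction: one value per test sample, 1-D like predict() output.
    y_prediction = y_test.ravel() + rng.normal(0, 1, size=len(test_index))
    test_sizes.append(len(test_index))

    # Same number of samples on both sides, so this works.
    rmse = np.sqrt(mean_squared_error(y_test.ravel(), y_prediction))

# Comparing against the full label array fails: 16800 samples versus 4200.
try:
    mean_squared_error(y_label, y_prediction)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(test_sizes, mismatch_raised)
```

The key point is that the labels and the predictions are both taken from the same test_index, so they always describe the same samples.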

Update: based on your latest error, try

fig, bx = plt.subplots(figsize=(19,8))
bx.plot(y_prediction, scaled_wind_speed_test,'bs')
fig.suptitle('Wind Power Prediction v Wind Speed', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
plt.xticks(rotation=30)
plt.show() 

Update 2: if you get an error on the other plot, try

fig, bx = plt.subplots(figsize=(19,8))
bx.plot( y_prediction, y_test)
fig.suptitle('Wind Power Prediction v Measured Wind Power ', fontsize=20)
plt.xlabel('Wind Power Data')
plt.ylabel('Predicted Power')
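As a rough illustration of the plotting fix, the sketch below uses made-up arrays with the shapes from the question: a (4200, 1) column of measured values and a (4200,) prediction. The data and the axis labels are assumptions chosen for the example, and the Agg backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
y_test = rng.uniform(0, 2000, size=(4200, 1))                  # measured power, 2-D column
y_prediction = y_test.ravel() + rng.normal(0, 50, size=4200)   # stand-in for predict()

# plot() requires x and y to agree in their first dimension.
assert y_prediction.shape[0] == y_test.shape[0]

fig, ax = plt.subplots(figsize=(19, 8))
# Flattening y_test keeps both inputs 1-D, so exactly one series is drawn.
lines = ax.plot(y_test.ravel(), y_prediction, "bs")
ax.set_xlabel("Measured Power")
ax.set_ylabel("Predicted Power")
fig.suptitle("Wind Power Prediction v Measured Wind Power", fontsize=20)
```

Passing the full 16800-row label array in place of y_test reproduces the ValueError from the traceback, because the first dimensions no longer agree.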

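Answer 1 (score: 0)

The scikit-learn function mean_squared_error expects two arrays with the same number of samples. The error you are getting implies that the two arrays differ in size.

You can check the array shapes with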

print(array_1.shape)
print(array_2.shape)

If the output you get is

output:
> (4200,)
> (4200, 1)

you can fix it by changing the second array to

new_array_2 = array_2.transpose()[0]

and then calling

mean_squared_error(array_1, new_array_2)
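A small NumPy sketch of that flattening step, using 4-element arrays in place of the 4200-element ones. The names flat_a, flat_b and flat_c are only for illustration. ravel() and reshape(-1) are common alternatives to transpose()[0], and the last two lines show why the extra dimension matters when you subtract the arrays yourself.

```python
import numpy as np

array_1 = np.arange(4.0)                 # shape (4,), like predict() output
array_2 = np.arange(4.0).reshape(-1, 1)  # shape (4, 1), like a reshaped label column

# Three equivalent ways to drop the trailing dimension of a single-column array.
flat_a = array_2.transpose()[0]
flat_b = array_2.ravel()
flat_c = array_2.reshape(-1)

# Broadcasting pitfall: (4,) minus (4, 1) silently yields a (4, 4) matrix.
diff_wrong = array_1 - array_2
diff_right = array_1 - flat_b

print(flat_a.shape, diff_wrong.shape, diff_right.shape)
```

For a column with exactly one feature the three forms give the same values; ravel() and reshape(-1) also flatten arrays with more than one column, whereas transpose()[0] would keep only the first column.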

If instead your two input arguments, whatever they are, give you the following shapes

print(array_1.shape)
print(array_2.shape)

output:
> (4200,)
> (16800, 1)

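try taking the same samples from both arrays, for example by indexing the longer one with the test indices from the split,

new_array_2 = array_2[test_index]

so that both arrays have the same number of rows, whether that is 16800 or 4200. If the two now have the same length but one of them still has an extra dimension,

then also do

new_new_array_2 = new_array_2.transpose()[0]

and pass it to mean_squared_error, for example

mean_squared_error(array_1, new_new_array_2)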