Getting ValueError: x and y must be the same size. Any help would be appreciated

Asked: 2019-04-29 23:19:15

Tags: python python-3.x machine-learning scikit-learn linear-regression

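I am currently working on a machine learning problem to predict the weather. When I run my code in a Jupyter notebook, I get the error above, and I am not sure where I went wrong, because my data values should all be in a 2D array. Any help would be appreciated. In my notebook, the traceback specifically points to line 133,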

        axes[row, col]. scatter(df2[feature], df2['meantempm'])

as the problem. If it helps, I am using https://stackabuse.com/using-machine-learning-to-predict-the-weather-part-2/ as my reference for this.

import jupyter
import IPython
from IPython import get_ipython
from datetime import datetime
from datetime import timedelta
import time
from collections import namedtuple
import pandas as pd
import requests
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, median_absolute_error
from sklearn.metrics import explained_variance_score, \
    mean_absolute_error, \
    median_absolute_error
import tensorflow as tf

df = pd.read_csv('end-part2_df.csv').set_index('date')

df.corr()[['meantempm']].sort_values('meantempm')

predictors = ['meantempm_1',  'meantempm_2',  'meantempm_3',
              'mintempm_1',   'mintempm_2',   'mintempm_2',
              'meandewptm_1', 'meandewptm_2', 'meandewptm_3',
              'maxdewptm_1',  'maxdewptm_2',  'maxdewptm_3',
              'mindewptm_1',  'mindewptm_2',  'mindewptm_3',
              'maxtempm_1',   'maxtempm_2',   'maxtempm_3']

df2 = df[['meantempm'] + predictors]

get_ipython().run_line_magic('matplotlib','inline')

plt.rcParams['figure.figsize'] = [16, 22]

fig, axes = plt.subplots(nrows=6, ncols=3, sharey=True)

arr = np.array(predictors).reshape(6, 3)

for row, col_arr in enumerate(arr):
    for col, feature in enumerate(col_arr):
        axes[row, col]. scatter(df2[feature], df2['meantempm'])
        if col == 0:
            axes[row, col].set(xlabel=feature, ylabel='meantempm')
        else:
            axes[row, col].set(xlabel=feature)
plt.show()

1 Answer:

Answer 0: (score: 0)

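Your df2['mintempm_2'] is 2D, with shape (997, 2). This is because you included 'mintempm_2' twice in your predictors list, so df2 ends up with two columns of that name, and selecting the label returns both of them. scatter then receives an x with twice as many values as y, which raises the ValueError. The third entry in that row was presumably meant to be 'mintempm_3'.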
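
A minimal sketch of how the duplicated label produces the 2D result, and one way to spot it. The DataFrame here is a hypothetical stand-in filled with random values (only the row count of 997 and the column names are taken from the question), and 'mintempm_3' is assumed to be the intended third predictor:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's CSV: 997 rows of made-up values.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "meantempm": rng.normal(size=997),
    "mintempm_1": rng.normal(size=997),
    "mintempm_2": rng.normal(size=997),
    "mintempm_3": rng.normal(size=997),
})

# Same mistake as in the question: 'mintempm_2' is listed twice.
predictors = ["mintempm_1", "mintempm_2", "mintempm_2"]
df2 = df[["meantempm"] + predictors]

# A duplicated label selects every matching column, so the result is a
# two-column DataFrame instead of a Series.
dup_shape = df2["mintempm_2"].shape
ok_shape = df2["mintempm_1"].shape
print(dup_shape, ok_shape)

# List the labels that occur more than once in df2.
duplicated = df2.columns[df2.columns.duplicated()].tolist()
print(duplicated)

# Assumed fix: replace the repeated entry with 'mintempm_3'.
fixed_predictors = ["mintempm_1", "mintempm_2", "mintempm_3"]
df2_fixed = df[["meantempm"] + fixed_predictors]
print(df2_fixed["mintempm_2"].shape)
```

Checking df2.columns.duplicated() before plotting is a cheap guard: if it flags any label, each scatter call that uses that label will receive x and y of different sizes.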