Python ValueError。不了解错误或如何解决

时间:2019-04-25 19:32:55

标签: python arrays numpy dataset

我在这里关注本教程; https://www.analyticsvidhya.com/blog/2018/10/predicting-stock-price-machine-learningnd-deep-learning-techniques-python/#comment-155692

不是使用提供的数据集,而是分配作业所需的数据集。

使用的代码是

#import packages
import pandas as pd
import numpy as np

#to plot within notebook
import matplotlib.pyplot as plt
%matplotlib inline

#setting figure size
from matplotlib.pylab import rcParams
rcParams['figure.figsize'] = 20,10

#for normalizing data
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))

#read the file
df = pd.read_csv('C:/Users/Usert/Downloads/stock-20050101-to-20171231/stock-20050101-to-20171231/IBM_2006-01-01_to_2018-01-01.csv')

#print the head
df.head()

#setting index as date
df['Date'] = pd.to_datetime(df.Date,format='%Y-%m-%d')
df.index = df['Date']

#plot
plt.figure(figsize=(16,8))
plt.plot(df['Close'], label='Close Price history')

#creating dataframe with date and the target variable
data = df.sort_index(ascending=True, axis=0)
new_data = pd.DataFrame(index=range(0,len(df)),columns=['Date', 'Close'])

for i in range(0,len(data)):
     new_data['Date'][i] = data['Date'][i]
     new_data['Close'][i] = data['Close'][i]

#splitting into train and validation
train = new_data[:987]
valid = new_data[987:]

new_data.shape, train.shape, valid.shape
((1235, 2), (987, 2), (248, 2))

train['Date'].min(), train['Date'].max(), valid['Date'].min(), valid['Date'].max()


#make predictions
preds = []
for i in range(0,248):
    a = train['Close'][len(train)-248+i:].sum() + sum(preds)
    b = a/248
    preds.append(b)

#calculate rmse
rms=np.sqrt(np.mean(np.power((np.array(valid['Close'])-preds),2)))
rms

#plot
valid['Predictions'] = 0
valid['Predictions'] = preds
plt.plot(train['Close'])
plt.plot(valid[['Close', 'Predictions']])

运行正常,直到遇到错误时“ #Calculate RMSE”。

File "<ipython-input-92-1256d885493e>", line 65, in <module>
rms=np.sqrt(np.mean(np.power((np.array(valid['Close'])-preds),2)))

ValueError: operands could not be broadcast together with shapes (2033,) (248,) 

按要求使用“ print(valid.shape)”和“ print(len(preds))”将返回“(604,3)”和“ 248”。

有人知道每次更改数字时如何更改数字以适合我的数据集?

仅供参考;

我使用的数据集有7列,分别是“日期,打开,高,低,关闭,成交量和名称”,其中包含标题的3021行数据。

尽管本教程中有8列,分别是“日期,开盘价,最高价,最低价,最后价,收盘价,total_trade_quantity和营业额”,其中包括标题的1236行。

0 个答案:

没有答案