Question

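I wrote some code to practice machine learning, but I am hitting an error I don't understand, because the columns I select are the ones listed in the Quandl table.

Here is my code: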

import pandas as pd
import math 
import quandl 
import numpy as np 
from sklearn import preprocessing, svm, model_selection  # preprocessing is used to clean or scale data prior to machine learning
from sklearn.model_selection import train_test_split, cross_validate
from sklearn.linear_model import LinearRegression 

df=quandl.get("EOD/NKE", authtoken="jcfsm6-47Pe1hgxDqjDU")

df=df[['ADJ_OPEN','ADJ_HIGH','ADJ_LOW','ADJ_CLOSE','ADJ_VOLUME']]
df['HL_PCT']=(df['ADJ_HIGH'] -df['ADJ_LOW'])/ df['ADJ_CLOSE']*100.0
df['PCT_Change']=(df['ADJ_CLOSE']-df['ADJ_OPEN'])/df['ADJ_OPEN']*100.0
df=df[['ADJ_CLOSE','HL_PCT','PCT_Change','ADJ_VOLUME']]

print(df.head())

forecast_col='ADJ_CLOSE'
df.fillna(value=-99999, inplace=True)
forecast_out=int(math.ceil(0.01*len(df)))
df['label']=df[forecast_col].shift(-forecast_out)
df.dropna(inplace=True)  # NaN is short for Not a Number

# By convention in machine learning, X names the features and y names the label.

X=np.array(df.drop(['label'], axis=1))
y=np.array(df['label'])

X=preprocessing.scale(X)
y=np.array(df['label'])

# Hold out 20% of the data for testing and train on the remaining 80%.
# train_test_split returns the splits in the order X_train, X_test, y_train, y_test.
X_train, X_test, y_train, y_test=train_test_split(X,y,test_size=0.2)

# Define the classifier
clf=svm.SVR(gamma='auto')

# Train the model 
clf.fit(X_train, y_train)

# Test the model
confidence=clf.score(X_test, y_test)

print(confidence)

When I run it with the command python3 my.py, this is the error message:

KeyError: "None of [Index(['ADJ_OPEN', 'ADJ_HIGH', 'ADJ_LOW', 'ADJ_CLOSE', 'ADJ_VOLUME'], dtype='object')] are in the [columns]"
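A KeyError of this form means that none of the requested names match the DataFrame's actual column labels exactly. Matching is case-sensitive. Below is a minimal sketch of how this could be checked. It assumes the mismatch is only in letter case; the small DataFrame and its mixed-case column names are made up for illustration and are not real Quandl output.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by quandl.get(); the
# mixed-case column names here are an assumption for illustration only.
df = pd.DataFrame({
    "Adj_Open": [10.0, 11.0],
    "Adj_High": [12.0, 12.5],
    "Adj_Low": [9.5, 10.5],
    "Adj_Close": [11.0, 12.0],
    "Adj_Volume": [1000.0, 1500.0],
})

wanted = ['ADJ_OPEN', 'ADJ_HIGH', 'ADJ_LOW', 'ADJ_CLOSE', 'ADJ_VOLUME']

# Step 1: inspect the real column names before selecting anything.
print(df.columns.tolist())

# Step 2: list the requested names that are not present as written.
missing = [c for c in wanted if c not in df.columns]
print(missing)

# Step 3: normalise the labels to upper case so the selection matches.
df.columns = df.columns.str.upper()
subset = df[wanted]
print(subset.head())
```

Running print(df.columns.tolist()) on the real DataFrame right after quandl.get() would show whether the names differ in case or in some other way.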


0 answers: