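I am trying to predict the winning party. The columns I selected are the candidate's party and the votes for the candidate, as they appear in the dataset. My code is as follows: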
# Loading and cleaning dataset
df4 = pd.read_csv('Election-Results-2018 - Parlimen_Results_By_Candidate.csv')
df4['Votes for Candidate'] = df4['Votes for Candidate'].str.replace(',','').astype(float)
df4['Total Votes Cast'] = df4['Total Votes Cast'].str.replace(',','').astype(float)
df4['% of total Votes'] = df4['% of total Votes'].str.replace('%','').astype(float)
# Step 1 - import the model
from sklearn.linear_model import LogisticRegression
# Step 2 - Define your training data
columns = ['Candidate Party', 'Votes for Candidate']
# Step 3 - create training dataset
X = df[columns]
y = df['New Results']
After running this code, I get the following error:
KeyError: "None of [Index(['Candidate Party', 'Votes for Candidate'], dtype='object')] are in the [columns]"
I am a beginner in machine learning and would appreciate any help and guidance. Thank you.
Answer 0 (score: 0)
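This is a simple mistake: you used the wrong name, df, instead of df4. Because the error is a KeyError rather than a NameError, df presumably refers to some other DataFrame that does not contain these columns. This should work: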
import pandas as pd
# Loading and cleaning dataset
df4 = pd.read_csv('Election-Results-2018 - Parlimen_Results_By_Candidate.csv')
df4['Votes for Candidate'] = df4['Votes for Candidate'].str.replace(',','').astype(float)
df4['Total Votes Cast'] = df4['Total Votes Cast'].str.replace(',','').astype(float)
df4['% of total Votes'] = df4['% of total Votes'].str.replace('%','').astype(float)
# Step 1 - import the model
from sklearn.linear_model import LogisticRegression
# Step 2 - Define your training data
columns = ['Candidate Party', 'Votes for Candidate']
# Step 3 - create training dataset
X = df4[columns]
y = df4['New Results']
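If you want to catch this kind of mix-up earlier, one option is to check which of the requested column names are actually present before indexing. The sketch below is only an illustration: the rows in df4, the unrelated df frame, and the helper missing_columns are all made up for the example and are not taken from your CSV.

```python
import pandas as pd

# Hypothetical stand-in rows; the real data comes from the CSV in the question.
df4 = pd.DataFrame({
    "Candidate Party": ["A", "B", "A"],
    "Votes for Candidate": [1200.0, 950.0, 300.0],
    "New Results": [1, 0, 0],
})
# Hypothetical unrelated frame, standing in for whatever `df` held in the question.
df = pd.DataFrame({"some_other_column": [1, 2, 3]})

columns = ["Candidate Party", "Votes for Candidate"]


def missing_columns(frame, wanted):
    """Return the names in `wanted` that are not columns of `frame`."""
    return [name for name in wanted if name not in frame.columns]


missing_in_df = missing_columns(df, columns)    # both names are missing here
missing_in_df4 = missing_columns(df4, columns)  # nothing is missing here

# Select only once the check passes, so a wrong frame is caught early.
if not missing_in_df4:
    X = df4[columns]
    y = df4["New Results"]
```

Printing df.columns.tolist() on the frame you are about to index gives the same information at a glance, and it also exposes column names that differ only by stray spaces or capitalisation.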