Question

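I am trying to predict the winning party. The columns I selected are Candidate Party and Votes for Candidate, named as they appear in the dataset. My code is as follows: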

# Loading and cleaning dataset
df4 = pd.read_csv('Election-Results-2018 - Parlimen_Results_By_Candidate.csv')
df4['Votes for Candidate'] = df4['Votes for Candidate'].str.replace(',','').astype(float)
df4['Total Votes Cast'] = df4['Total Votes Cast'].str.replace(',','').astype(float)
df4['% of total Votes'] = df4['% of total Votes'].str.replace('%','').astype(float)

# Step 1 - import the model 
from sklearn.linear_model import LogisticRegression

# Step 2 - Define your training data
columns = ['Candidate Party', 'Votes for Candidate']

# Step 3 - create training dataset
X = df[columns]
y = df['New Results']

After running this code, I get the following error:

KeyError: "None of [Index(['Candidate Party', 'Votes for Candidate'], dtype='object')] are in the [columns]"

I am a beginner in machine learning and would appreciate any help and guidance. Thank you.

Answer 1

This is a simple mistake: you used the wrong name, df, instead of df4. This should work:

import pandas as pd

# Loading and cleaning dataset
df4 = pd.read_csv('Election-Results-2018 - Parlimen_Results_By_Candidate.csv')
df4['Votes for Candidate'] = df4['Votes for Candidate'].str.replace(',','').astype(float)
df4['Total Votes Cast'] = df4['Total Votes Cast'].str.replace(',','').astype(float)
df4['% of total Votes'] = df4['% of total Votes'].str.replace('%','').astype(float)

# Step 1 - import the model 
from sklearn.linear_model import LogisticRegression

# Step 2 - Define your training data
columns = ['Candidate Party', 'Votes for Candidate']

# Step 3 - create training dataset
X = df4[columns]
y = df4['New Results']
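
To see why the original code raised a KeyError rather than a NameError, here is a minimal sketch. It assumes that some other DataFrame named df already existed in the session, which is a guess based on the error type. The rows and the helper missing_columns are made up for illustration and are not part of the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the election data; the values are invented.
df4 = pd.DataFrame({
    "Candidate Party": ["A", "B", "A"],
    "Votes for Candidate": [1200.0, 800.0, 950.0],
    "New Results": [1, 0, 1],
})

# Assumption: an unrelated frame called df was defined earlier in the session.
df = pd.DataFrame({"other": [1, 2, 3]})

columns = ["Candidate Party", "Votes for Candidate"]


def missing_columns(frame, wanted):
    """Return the names in `wanted` that are not columns of `frame`."""
    return [name for name in wanted if name not in frame.columns]


# Checking before indexing shows which frame actually holds the columns.
print(missing_columns(df, columns))
print(missing_columns(df4, columns))

# Indexing the wrong frame reproduces the error from the question.
try:
    X = df[columns]
except KeyError as exc:
    print("KeyError:", exc)

# Indexing the frame that was actually loaded works.
X = df4[columns]
y = df4["New Results"]
```

Printing df4.columns right after read_csv is another quick way to confirm the exact column names, including stray spaces.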

Prediction model: logistic regression, a popular classification model.
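
Once X and y are selected from the right DataFrame, fitting the model is the next step. The sketch below is one possible way to do it, not necessarily what the asker intended. It assumes that Candidate Party holds text labels, which LogisticRegression cannot use directly, and that New Results is a 0/1 label. The rows are invented, and one-hot encoding with pd.get_dummies is a choice made here for illustration.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical rows standing in for the real election data.
df4 = pd.DataFrame({
    "Candidate Party": ["A", "B", "A", "C", "B", "C"],
    "Votes for Candidate": [12000.0, 3000.0, 15000.0, 2000.0, 9000.0, 11000.0],
    "New Results": [1, 0, 1, 0, 0, 1],  # assumed encoding: 1 = won, 0 = lost
})

columns = ["Candidate Party", "Votes for Candidate"]

# One-hot encode the party names so every feature is numeric.
X = pd.get_dummies(df4[columns], columns=["Candidate Party"], dtype=float)
y = df4["New Results"]

# max_iter is raised because the vote counts are unscaled.
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# Predictions on the training rows, only to show the call shape.
predictions = model.predict(X)
print(predictions)
```

On real data you would hold out a test set, for example with train_test_split, instead of predicting on the rows used for training.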
