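The error is as stated above. I think it may be related to my get_dummies calls, but since I am very new to this I am not sure. Any help for a newcomer would be much appreciated.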
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier
from sklearn import tree
df = pd.read_csv("D:/Machine Learning/Kaggle/Loan Prediction/train.csv")
df = df.dropna()
print(df.isnull().sum())
train, test = train_test_split(df, test_size=0.3, random_state=0)
xTrain = train.drop('Loan_Status', axis=1)
yTrain = train['Loan_Status']
xTest = test.drop('Loan_Status', axis=1)
yTest = test['Loan_Status']
xTrain = pd.get_dummies(xTrain)
xTest = pd.get_dummies(xTest)
model = BaggingClassifier(tree.DecisionTreeClassifier(random_state=1))
model.fit(xTrain,yTrain)
score = model.score(xTest,yTest)
print(score)
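Answer 0 (score: 0)
One possible solution to your problem is to create the dummy variables before splitting into train and test. Encoding each split separately can produce different columns if a category happens to be missing from one of the splits: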
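import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier
from sklearn import tree
df = pd.read_csv("D:/Machine Learning/Kaggle/Loan Prediction/train.csv")
df = df.dropna()
# Encode the features once, on the full data set, so both splits share the same columns
df_X = df.drop('Loan_Status', axis=1)
df_X = pd.get_dummies(df_X)
df_y = df['Loan_Status']
train_X, test_X, train_y, test_y = train_test_split(df_X, df_y, test_size=0.3, random_state=0)
model = BaggingClassifier(tree.DecisionTreeClassifier(random_state=1))
model.fit(train_X,train_y)
score = model.score(test_X, test_y)
print(score)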
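If the split has to happen before encoding, a different approach (not the one shown above) is to align the test columns to the training columns after calling get_dummies on each. The following is only a minimal sketch: the two small frames and their column names (Property_Area, ApplicantIncome) are made-up stand-ins for the real data, chosen so that the test split lacks some categories.

```python
import pandas as pd

# Hypothetical toy data: the test split is missing the "Rural" and
# "Semiurban" categories that appear in the training split.
train = pd.DataFrame({
    "Property_Area": ["Urban", "Rural", "Semiurban"],
    "ApplicantIncome": [5000, 3000, 4000],
})
test = pd.DataFrame({
    "Property_Area": ["Urban", "Urban"],
    "ApplicantIncome": [6000, 2500],
})

# Encoding each split on its own yields different column sets.
train_X = pd.get_dummies(train)
test_X = pd.get_dummies(test)
missing_in_test = set(train_X.columns) - set(test_X.columns)
print(sorted(missing_in_test))

# Reindex the test frame to the training columns: columns absent from
# the test split are added and filled with 0, and any column that exists
# only in the test split is dropped.
test_X_aligned = test_X.reindex(columns=train_X.columns, fill_value=0)
print(list(test_X_aligned.columns) == list(train_X.columns))
```

With the columns aligned this way, the fitted model sees the same features at prediction time as it did during training.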