I am trying to fit a Gaussian naive Bayes classifier to a dataset. Here is the code:
import pandas as pd
import numpy as np
from sklearn.naive_bayes import GaussianNB
df = pd.read_csv('train_data.csv')
X = df.iloc[:,0:23]
X
Y = df.iloc[:,24:25]
clf = GaussianNB()
clf.fit(X, Y)
default_next_month is the target variable, and this is a binary classification problem. Y is supposed to contain the last column, but calling clf.fit(X, Y) raises an error.
This is what the data looks like:
   LIMIT_BAL  SEX  EDUCATION  MARRIAGE  AGE  PAY_0  PAY_2  PAY_3  PAY_4  PAY_5  ...  BILL_AMT4  BILL_AMT5  BILL_AMT6  PAY_AMT1  PAY_AMT2  PAY_AMT3  PAY_AMT4  PAY_AMT5  PAY_AMT6  default_next_month
0      20000    2          2         1   24      2      2     -1     -1     -2  ...          0          0          0         0       689         0         0         0         0                   1
1     120000    2          2         2   26     -1      2      0      0      0  ...       3272       3455       3261         0      1000      1000      1000         0      2000                   1
2      90000    2          2         2   34      0      0      0      0      0  ...      14331      14948      15549      1518      1500      1000      1000      1000      5000                   0
3      50000    2          2         1   37      0      0      0      0      0  ...      28314      28959      29547      2000      2019      1200      1100      1069      1000                   0
4      50000    1          2         1   57     -1      0     -1      0      0  ...      20940      19146      19131      2000     36681     10000      9000       689       679
Answer 0 (score: 1)
All I had to do was change the statement to
Y = df.iloc[:,-1]
and it worked.
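
A probable explanation, sketched under the assumption that train_data.csv has exactly 24 columns (23 features plus default_next_month, as the printed header suggests): column positions then run from 0 to 23, so the slice 24:25 selects no columns and Y ends up empty, whereas -1 always addresses the last column and returns a 1-D Series. The synthetic frame and the f0 to f22 column names below are made up for illustration and are not the real data.

```python
import numpy as np
import pandas as pd
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for train_data.csv (assumption: 23 features + 1 binary target).
rng = np.random.default_rng(0)
n_rows = 50
features = rng.integers(0, 100, size=(n_rows, 23))
target = np.tile([0, 1], n_rows // 2)  # alternating labels so both classes occur
columns = [f"f{i}" for i in range(23)] + ["default_next_month"]
df = pd.DataFrame(np.column_stack([features, target]), columns=columns)

X = df.iloc[:, 0:23]

# Valid positions are 0..23, so this slice is out of range and selects nothing.
Y_empty = df.iloc[:, 24:25]
print(Y_empty.shape)

# -1 picks the last column regardless of how many columns there are.
Y = df.iloc[:, -1]
print(Y.shape)

clf = GaussianNB()
clf.fit(X, Y)
print(clf.classes_)
```

Checking Y.shape before calling fit is a quick way to catch this kind of off-by-one slice. If the real file has a different number of columns, the positions in the X slice may need adjusting as well.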