I have the following data:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
model = GaussianNB()
d = {'Pos': [1,2,3,4,5,6,7,8,9,10], 'Neg': [10,9,8,7,6,5,4,3,2,1], 'Res': ['win','win','win','win','draw','loss','loss','loss','loss','loss',]}
df = pd.DataFrame(d)
I am then trying to fit the following simple Naive Bayes classifier:
train, test = train_test_split(df,test_size=0.2)
train_data = (train.Pos.values, train.Neg.values)
train_target = train.Res.values
model.fit(train_data, train_target)
But I keep getting the following error:
Found input variables with inconsistent numbers of samples: [2, 8]
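I have experimented, and it seems that instead of reading the values of the two arrays, it is reading how many arrays there are in (train.Pos.values, train.Neg.values); this may be what causes the problem.
Why does this happen, and how can I change my code to fix it?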
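Answer 0 (score: 2)
Select both feature columns as a single DataFrame, which is already 2D, and use the Res column as the target: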
train, test = train_test_split(df,test_size=0.2)
train_data = train[['Pos', 'Neg']]
train_target = train['Res']
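For reference, a minimal end-to-end sketch of this approach, reusing the data from the question. The `random_state=0` argument and the prediction step on the test rows are assumptions added here for reproducibility and illustration; they are not part of the original answer.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

d = {'Pos': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'Neg': [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
     'Res': ['win', 'win', 'win', 'win', 'draw',
             'loss', 'loss', 'loss', 'loss', 'loss']}
df = pd.DataFrame(d)

# random_state is an assumption added here so the split is reproducible.
train, test = train_test_split(df, test_size=0.2, random_state=0)

# Double brackets select a DataFrame of shape (n_samples, n_features).
train_data = train[['Pos', 'Neg']]
# Single brackets select a 1D Series with one label per sample.
train_target = train['Res']

model = GaussianNB()
model.fit(train_data, train_target)

# Predict on the held-out rows, using the same two feature columns.
predictions = model.predict(test[['Pos', 'Neg']])
print(train_data.shape, train_target.shape)
print(predictions)
```

With 10 rows and `test_size=0.2`, the training features have shape (8, 2) and the target has shape (8,), so the sample counts agree and `fit` no longer raises the error.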
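Answer 1 (score: 0)
You are creating a tuple of numpy arrays from the DataFrame. scikit-learn expects the features as one 2D array of shape (n_samples, n_features), so the tuple of two arrays is treated as 2 samples, which does not match the 8 target values; hence the [2, 8] in the error. You need a single 2D array built from the two columns: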
train, test = train_test_split(df, test_size=0.2)
train_data = train.values[:, :2]
train_target = train.Res.values
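The shape mismatch can be seen directly with numpy. The sketch below uses two hypothetical 8-element arrays standing in for `train.Pos.values` and `train.Neg.values`; `np.column_stack` is shown as one possible way to build the 2D array and is not taken from the original answers.

```python
import numpy as np

# Hypothetical stand-ins for train.Pos.values and train.Neg.values (8 rows).
pos = np.array([1, 2, 3, 4, 5, 6, 7, 8])
neg = np.array([10, 9, 8, 7, 6, 5, 4, 3])

# A tuple of two 1D arrays converts to 2 rows of 8 values each,
# which scikit-learn reads as 2 samples with 8 features.
as_tuple = np.asarray((pos, neg))
print(as_tuple.shape)  # (2, 8)

# Stacking the arrays as columns gives 8 samples with 2 features.
stacked = np.column_stack((pos, neg))
print(stacked.shape)  # (8, 2)
```

The first shape is where the 2 in the error message comes from; the 8 is the length of the target array.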