Question

I have started implementing multinomial Naive Bayes with the following data frame:

   Disc Bus Dep Edu
0   1   2   2   1   
1   0   1   1   1   
2   1   2   1   4   
3   0   1   1   1   
4   0   2   1   3

I also split it into training and test sets:

X = data_rev.drop('Disc', axis = 1)

y = data_rev['Disc']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 21)

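Then I started computing the probabilities:

1. Compute the prior log probability of each class (using np.unique, which returns the sorted unique elements of an array)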

separated = [[x for x, t in zip(X_train, y_train) if t == c] for c in np.unique(y_train)]

count_sample = X_train.shape[0]

self.class_log_prior_ = [np.log(len(i) / count_sample) for i in separated]

count = np.array([np.array(i).sum(axis = 0) for i in separated])

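But the last line fails with TypeError: cannot perform reduce with flexible type. Apparently, when axis = 0 is passed, numpy finds strings (the data frame headers) and cannot perform the sum. Why does this happen, and how can I fix it in the count operation?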
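A minimal sketch of what is likely going on, assuming pandas and NumPy are available. Iterating over a DataFrame (which is what zip(X_train, y_train) does) yields its column labels rather than its rows, so separated ends up holding strings. The sketch rebuilds the five sample rows shown above and skips train_test_split to stay self-contained; converting to NumPy arrays and using boolean masks is one possible fix, not necessarily the only one.

```python
import numpy as np
import pandas as pd

# The five sample rows from the question
data_rev = pd.DataFrame({
    "Disc": [1, 0, 1, 0, 0],
    "Bus": [2, 1, 2, 1, 2],
    "Dep": [2, 1, 1, 1, 1],
    "Edu": [1, 1, 4, 1, 3],
})
X = data_rev.drop("Disc", axis=1)
y = data_rev["Disc"]

# Iterating over a DataFrame yields column labels, so zip pairs
# each label with a target value and never touches the row data.
labels = [x for x, t in zip(X, y)]

# Possible fix: work on the underlying arrays and select rows per class
X_arr = X.to_numpy()
y_arr = y.to_numpy()
classes = np.unique(y_arr)
separated = [X_arr[y_arr == c] for c in classes]

# Prior log probability of each class
class_log_prior = np.array(
    [np.log(len(rows) / X_arr.shape[0]) for rows in separated]
)

# Per-class feature totals, shape (n_classes, n_features)
count = np.array([rows.sum(axis=0) for rows in separated])

print(labels)
print(count)
```

With the real split, the same idea would apply to X_train.to_numpy() and y_train.to_numpy().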

Step 3 is as follows:

feature_log_prob_ = np.log(count / count.sum(axis = 1)[np.newaxis].T)

This of course raises an error as well, because it uses count.
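Once count is a numeric array, step 3 should work as written. The sketch below uses the per-class column sums of the five sample rows shown at the top of the question as count, and shows an equivalent spelling with keepdims=True. The alpha smoothing term is an assumption added for illustration (it is not in the original code); it avoids log(0) when a feature never occurs in a class.

```python
import numpy as np

# Per-class feature totals for the five sample rows (classes 0 and 1)
count = np.array([[4, 3, 5], [4, 3, 5]], dtype=float)

# Step 3 exactly as written in the question
original_form = np.log(count / count.sum(axis=1)[np.newaxis].T)

# Equivalent form: keepdims=True keeps the row sums as a column vector
feature_log_prob = np.log(count / count.sum(axis=1, keepdims=True))

# Assumption: additive (Laplace) smoothing with a hypothetical alpha
alpha = 1.0
smoothed = count + alpha
smoothed_log_prob = np.log(smoothed / smoothed.sum(axis=1, keepdims=True))

print(feature_log_prob)
print(smoothed_log_prob)
```

Each row of the exponentiated result sums to 1, which is a quick sanity check on the normalization.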