Naive Bayes algorithm always results in 0

Time: 2016-04-17 20:58:50

Tags: artificial-intelligence classification probability text-classification naivebayes

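I have a question about the Naïve Bayes classification method. I worked through what I thought was a simple example, but I have hit a roadblock.

Basically, this is the classification I am trying to do:

I want to be able to take some training data: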

input1 | input2 | input3 | class
  1    |   3    |   3    |   1
  2    |   1    |   1    |   2
  1    |   1    |   1    |   3
  3    |   3    |   3    |   1

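and classify new inputs into classes 1-3.

As I understand it, you first compute the prior probabilities. In this case, for the classes, that would be: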

class 1 = P(c_1) = 0.50
class 2 = P(c_2) = 0.25
class 3 = P(c_3) = 0.25
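
These priors follow directly from counting the class labels in the four training rows. A minimal Python sketch of that count (the variable names `classes` and `priors` are illustrative, not from any particular library):

```python
from collections import Counter

# Class column of the four training rows shown above.
classes = [1, 2, 3, 1]

# Prior of a class = (rows with that class) / (total rows).
counts = Counter(classes)
priors = {c: n / len(classes) for c, n in sorted(counts.items())}

print(priors)  # {1: 0.5, 2: 0.25, 3: 0.25}
```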

So that makes sense. They all add up to 1, and it is easy to see where those numbers come from.

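Then, because these values are numeric, I wanted to simplify them into ranges. So I rebuilt my data as:

Anyway, that is how I arrived at that table. Moving on to the Bayes part: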

P(Class 1 | avg_speed_1): 0.5
P(Class 1 | avg_speed_2): 0
P(Class 1 | avg_speed_3): 0
P(Class 2 | avg_speed_1): 0
P(Class 2 | avg_speed_2): 0.25
P(Class 2 | avg_speed_3): 0
P(Class 3 | avg_speed_1): 0
P(Class 3 | avg_speed_2): 0
P(Class 3 | avg_speed_3): 0.25
P(Class 1 | avg_distance_1): 0.5
P(Class 1 | avg_distance_2): 0
P(Class 1 | avg_distance_3): 0
P(Class 2 | avg_distance_1): 0
P(Class 2 | avg_distance_2): 0.25
P(Class 2 | avg_distance_3): 0
P(Class 3 | avg_distance_1): 0
P(Class 3 | avg_distance_2): 0
P(Class 3 | avg_distance_3): 0.25
P(Class 1 | avg_elev_gain_1): 0.5
P(Class 1 | avg_elev_gain_2): 0
P(Class 1 | avg_elev_gain_3): 0
P(Class 2 | avg_elev_gain_1): 0
P(Class 2 | avg_elev_gain_2): 0
P(Class 2 | avg_elev_gain_3): 0
P(Class 3 | avg_elev_gain_1): 0
P(Class 3 | avg_elev_gain_2): 0
P(Class 3 | avg_elev_gain_3): 0.5

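Now, this all still makes sense to me, and each class still adds up to 1. However, when I go to calculate the probability of each class, the zeros wreck the calculation.

Taking the first class as an example: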

P(Class 1 | avg_speed_1) *
P(Class 1 | avg_speed_2) *
P(Class 1 | avg_speed_3) *
P(Class 1 | avg_distance_1) *
P(Class 1 | avg_distance_2) *
P(Class 1 | avg_distance_3) *
P(Class 1 | avg_elev_gain_1) *
P(Class 1 | avg_elev_gain_2) *
P(Class 1 | avg_elev_gain_3) *
P(Class 1) = 0

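I find that this always equals zero, because so many of the input terms are zero! Where did I go wrong?!? Does this mean my training data is insufficient?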
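
For reference, here is a hedged sketch of how the per-class score is usually assembled in naive Bayes. It differs from the product above in two ways: each factor is a class-conditional likelihood P(feature = value | class), and only the value actually observed for each feature is multiplied in, so a query contributes one factor per feature rather than nine. Add-one (Laplace) smoothing is a common way to keep a value that never co-occurred with a class from forcing the whole product to zero. The sketch uses the original four-row table, since the binned table is not shown, and the names `rows`, `likelihood`, `score`, `posteriors`, and `alpha` are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Training rows from the table at the top of the question:
# (input1, input2, input3, class)
rows = [
    (1, 3, 3, 1),
    (2, 1, 1, 2),
    (1, 1, 1, 3),
    (3, 3, 3, 1),
]

n_features = 3
class_counts = Counter(r[-1] for r in rows)
# Distinct values seen per feature (used in the smoothing denominator).
values = [sorted({r[j] for r in rows}) for j in range(n_features)]

# feature_counts[c][j][v] = number of class-c rows whose feature j equals v
feature_counts = defaultdict(lambda: [Counter() for _ in range(n_features)])
for *features, c in rows:
    for j, v in enumerate(features):
        feature_counts[c][j][v] += 1


def likelihood(c, j, v, alpha=1.0):
    """Smoothed estimate of P(feature j = v | class c); alpha=0 disables smoothing."""
    return (feature_counts[c][j][v] + alpha) / (
        class_counts[c] + alpha * len(values[j])
    )


def score(c, x, alpha=1.0):
    """Unnormalised posterior: P(c) times the product of P(x_j | c) over features."""
    s = class_counts[c] / len(rows)
    for j, v in enumerate(x):
        s *= likelihood(c, j, v, alpha)
    return s


def posteriors(x, alpha=1.0):
    """Normalise the per-class scores so they sum to 1."""
    scores = {c: score(c, x, alpha) for c in sorted(class_counts)}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


query = (1, 3, 3)
print(posteriors(query))
```

With `alpha=0`, `score(1, (2, 3, 3), alpha=0)` is exactly 0, because input1 = 2 never occurs with class 1 in the training rows; with the default `alpha=1` the same score is small but nonzero, so the classes can still be compared.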

That said, is the Naïve Bayes method even the right way to approach this kind of classification?

Any thoughts would be greatly appreciated.

0 answers:

No answers