Machine learning: spam and ham implementation

Time: 2016-06-20 17:31:39

Tags: machine-learning scikit-learn

I recently started a machine learning tutorial. The first lesson is supervised learning (spam vs. ham), so I started by implementing it.

My implementation:

---------total spam count-------------
hi   free   offers   for   you   and   the   !   ....
5    3      9        4     4     6     8     6

---------total ham count-------------
hi   free   offers   for   you   and   the   !   ....
3    5      3        7     3     4     6     2



mail_1 : hi! how are you here are some free offers for you !!! 

hi  how  are  you  here  are  some  free  offers  for  you  !!!
1   1    2    1    1     2    1     1     1       1    1    4
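The per-message counting step could be sketched in Python roughly as follows. This is only an illustration, not the tutorial's code: the helper name `count_tokens` and the tokenization rule (lowercase words, with each `!` counted as its own token) are assumptions. Note that under this rule "you" is counted twice for mail_1, since it appears twice in the message.

```python
import re
from collections import Counter

def count_tokens(message):
    """Count how often each token occurs in one message (hypothetical helper)."""
    # Assumed tokenization: lowercase the text, then take runs of letters
    # as words and every single '!' as a separate token.
    tokens = re.findall(r"[a-z]+|!", message.lower())
    return Counter(tokens)

mail_1 = "hi! how are you here are some free offers for you !!! "
counts = count_tokens(mail_1)

# Print the count for a few tokens of interest.
for token in ("hi", "are", "you", "!"):
    print(token, counts[token])
```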


s[T] = c_spam(T) / ( c_spam(T) + c_ham(T) )

s[T] = how spammy is the word T
c_spam(T) = how many spam messages contain the word T
c_ham(T) = how many non-spam messages contain the word T
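As a minimal sketch, the formula above can be applied to the totals listed earlier like this. The function name `spamminess` and the fallback value of 0.5 for a word that appears in neither table are assumptions made for illustration; the formula itself is undefined in that case.

```python
# Totals copied from the spam and ham count tables above.
spam_counts = {"hi": 5, "free": 3, "offers": 9, "for": 4,
               "you": 4, "and": 6, "the": 8, "!": 6}
ham_counts = {"hi": 3, "free": 5, "offers": 3, "for": 7,
              "you": 3, "and": 4, "the": 6, "!": 2}

def spamminess(token, spam_counts, ham_counts, default=0.5):
    """Return s[T] = c_spam(T) / (c_spam(T) + c_ham(T))."""
    c_spam = spam_counts.get(token, 0)
    c_ham = ham_counts.get(token, 0)
    total = c_spam + c_ham
    if total == 0:
        # Assumption: a word never seen in training is treated as neutral.
        return default
    return c_spam / total

for token in spam_counts:
    print(token, round(spamminess(token, spam_counts, ham_counts), 3))
```

With these totals, "offers" and "!" score 0.75, while "free" scores 0.375, so in this sample "free" is actually seen more often in ham than in spam.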

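Now I have two questions:

1) Is this implementation correct?

2) After the classifier has produced its results, if I find that a new mail is spam, do I need to update the existing spam model?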

0 answers