How to train a Naive Bayes classifier with only one class

Time: 2017-12-23 08:38:07

Tags: python-3.x nltk naivebayes

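Suppose I want to know how religious a person is: rather than classifying him as religious or non-religious, I want the probability that he is religious. So I created a simple Naive Bayes classifier using the nltk toolkit. But it doesn't seem to work: I get a probability of 100% for both test samples.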

  

Train Set Size = 5
1.0
1.0

from nltk.classify import NaiveBayesClassifier
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Training documents; every one of them is a "positive" (religious) example.
dataPos = ['god jesus god church', 'buddha enlightenment love god',
           'jesus love krishna', 'Hare krishna Hare Krishna love',
           'god jesus church love']

def create_word_features(words):
    # Drop English stop words and mark every remaining word as a present feature.
    stop_words = set(stopwords.words("english"))
    return {word: True for word in words if word not in stop_words}

pos_views = []
for item in dataPos:
    words = item.split(' ')
    pos_views.append((create_word_features(words), "positive"))

train_set = pos_views[:]

print( 'Train Set Size = %d' %(len(train_set)) )

# Train
classifier = NaiveBayesClassifier.train(train_set)

# Testing Sample 1
person1 = '''
Love god krishna jesus
'''
words = word_tokenize(person1)
words = create_word_features(words)
prob_dist = classifier.prob_classify(words)
print(prob_dist.prob("positive"))

# Testing Sample 2
person2 = '''
I hate god hate
'''
words = word_tokenize(person2)
words = create_word_features(words)
prob_dist = classifier.prob_classify(words)
print(prob_dist.prob("positive"))

I think the problem is that there is only one class, but that is how I want to train the classifier. Can anyone tell me how to solve this?
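To illustrate why a single class always yields 1.0, here is a minimal pure-Python sketch of the Naive Bayes posterior. It is not nltk's implementation: the helper names `train` and `posterior`, the add-one smoothing, and the two "negative" sentences are assumptions made only for this illustration. The posterior is each label's score divided by the sum of scores over all labels, so with one label the score is divided by itself.

```python
from collections import Counter

def train(labeled_docs):
    """Count labels and per-label word occurrences from (words, label) pairs."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for words, label in labeled_docs:
        label_counts[label] += 1
        # Count each word at most once per document (presence features).
        word_counts.setdefault(label, Counter()).update(set(words))
        vocab.update(words)
    return label_counts, word_counts, vocab

def posterior(model, words):
    """Return P(label | words) for every label seen in training."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = n / total  # prior P(label)
        for w in set(words):
            if w in vocab:  # words never seen in training are ignored
                # Add-one smoothing (an assumption of this sketch).
                score *= (word_counts[label][w] + 1) / (n + 2)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

positives = [("god jesus church".split(), "positive"),
             ("jesus love krishna".split(), "positive")]
# Hypothetical negative examples, invented for this sketch only.
negatives = [("hate church".split(), "negative"),
             ("hate god".split(), "negative")]

one_class = posterior(train(positives), "i hate god hate".split())
two_class = posterior(train(positives + negatives), "i hate god hate".split())

print(one_class["positive"])  # 1.0, whatever the input text is
print(two_class["positive"])  # strictly between 0 and 1
```

Under these assumptions, a graded probability only appears once the model has at least one other label to compare against.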

0 answers:

No answers