I used a Naive Bayes classifier, but now I want to use an SVM classifier. How do I do that?

Time: 2019-08-18 20:00:01

Tags: python nlp nltk svm

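I am classifying text into two classes: imperative and non-imperative. I prepared the text in the form the Naive Bayes classifier expects, but now I also need to use an SVM. How should I do that? (I also need to classify the text and compute the accuracy.) Thanks for reading and trying to answer my question.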

import nltk

# `train` is a list of (tokenized sentence, category) pairs
all_words_list = [word for (sent, cat) in train for word in sent]
all_words = nltk.FreqDist(all_words_list)
word_items = all_words.most_common(1000)
word_features = [word for (word, count) in word_items]

def document_features(document, word_features):
    document_words = set(document)
    features = {}
    for word in word_features:
        features['contains({})'.format(word)] = (word in document_words)
    return features

featuresets = [(document_features(d, word_features), c) for (d, c) in train]

train_set, test_set = featuresets[360:], featuresets[:360]
classifier = nltk.NaiveBayesClassifier.train(train_set)
print(nltk.classify.accuracy(classifier, test_set))

1 answer:

Answer 0 (score: 1)

I suggest first splitting your dataset properly into training and test sets.

X contains the feature variables and Y the response variable. We split them 70%/30%.
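The question's data is a list of (feature dict, label) pairs, so X and Y have to be built from it before splitting. Here is a minimal sketch of one way to do that, assuming scikit-learn's `DictVectorizer` is acceptable. The small `featuresets` list is made-up stand-in data, not the asker's real data.

```python
from sklearn.feature_extraction import DictVectorizer

# Hypothetical stand-in for the question's `featuresets`:
# a list of (feature dict, label) pairs.
featuresets = [
    ({'contains(open)': True, 'contains(door)': True}, 'imperative'),
    ({'contains(open)': False, 'contains(door)': True}, 'non-imperative'),
    ({'contains(open)': True, 'contains(door)': False}, 'imperative'),
    ({'contains(open)': False, 'contains(door)': False}, 'non-imperative'),
]

# Separate the feature dicts from the labels.
feature_dicts = [features for (features, label) in featuresets]
Y = [label for (features, label) in featuresets]

# Turn the dicts into a sparse numeric matrix with one column per feature name.
# Boolean values are stored as 1.0 / 0.0.
vectorizer = DictVectorizer(sparse=True)
X = vectorizer.fit_transform(feature_dicts)

print(X.shape)
```

Keep the fitted `vectorizer`. Any new sentence has to go through `vectorizer.transform` so its columns line up with the training matrix.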

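from sklearn import svm
from sklearn import metrics
from sklearn.model_selection import train_test_split

# Hold out 30% of the data for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, Y, random_state=101, test_size=0.3)

# The scikit-learn docs describe the SVM parameters in more detail
model = svm.SVC(kernel='rbf', C=10000.0, gamma='auto')
model = model.fit(X_train, y_train)
# accuracy_score expects (y_true, y_pred)
print('Accuracy is', round(metrics.accuracy_score(y_test, model.predict(X_test)), 2))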
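If you would rather keep the NLTK-style feature sets from the question unchanged, NLTK ships a wrapper, `nltk.classify.scikitlearn.SklearnClassifier`, that accepts the same (feature dict, label) pairs. The sketch below assumes a linear SVM (`LinearSVC`) is a reasonable choice for bag-of-words features. That is a swapped-in choice, not the RBF kernel used above, and the toy `train_set` and `test_set` are invented for illustration.

```python
import nltk
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.svm import LinearSVC

# Hypothetical toy data in the same shape as the question's train_set / test_set.
train_set = [
    ({'contains(open)': True, 'contains(is)': False}, 'imperative'),
    ({'contains(open)': True, 'contains(is)': False}, 'imperative'),
    ({'contains(open)': False, 'contains(is)': True}, 'non-imperative'),
    ({'contains(open)': False, 'contains(is)': True}, 'non-imperative'),
]
test_set = [
    ({'contains(open)': True, 'contains(is)': False}, 'imperative'),
    ({'contains(open)': False, 'contains(is)': True}, 'non-imperative'),
]

# Wrap a scikit-learn SVM so it exposes NLTK's classifier interface.
svm_classifier = SklearnClassifier(LinearSVC())
svm_classifier.train(train_set)

# The same accuracy helper used with NaiveBayesClassifier works here too.
accuracy = nltk.classify.accuracy(svm_classifier, test_set)
print(accuracy)

# Classify a single feature dict.
prediction = svm_classifier.classify({'contains(open)': True, 'contains(is)': False})
print(prediction)
```

With this wrapper, `nltk.NaiveBayesClassifier.train(train_set)` in the question can be replaced by the two `SklearnClassifier` lines, and the rest of the pipeline stays as it is.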