How to remove all words except the stop words from CountVectorizer

Asked: 2017-12-12 10:44:42

Tags: scikit-learn nlp

I want to keep only the words that I pass to the CountVectorizer as stop words.

CV = CountVectorizer(max_features=500, stop_words=frozenset(["word1", "word2", "word3"]))

How can I do this?

1 answer:

Answer 0: (score: 3)

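IIUC, you want to use the vocabulary parameter instead of stop_words. stop_words removes the listed words from the counts, whereas vocabulary restricts the counts to the listed words only.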

CV = CountVectorizer(max_features=500, vocabulary=["word1", "word2", "word3"])
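
To see the effect, here is a minimal sketch. The two sample documents are made up for illustration and are not from the question. It leaves out max_features because, as far as I know from the scikit-learn docs, that parameter is ignored once a fixed vocabulary is given.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sample documents, invented for this illustration.
docs = [
    "word1 word2 other words here",
    "word3 word3 something else word1",
]

# The only terms we want to keep as features.
keep = ["word1", "word2", "word3"]

# A fixed vocabulary means every other token is dropped from the counts.
cv = CountVectorizer(vocabulary=keep)
counts = cv.fit_transform(docs).toarray()

# Recover the column order from the term -> column index mapping.
columns = sorted(cv.vocabulary_, key=cv.vocabulary_.get)

print(columns)
print(counts)
```

Each row of counts is one document and each column is one of the kept words, in the order they were passed. Words such as "other" or "something" do not appear at all.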