I want to keep only the words that I am currently passing to CountVectorizer as stop words.
CV = CountVectorizer(max_features=500, stop_words=frozenset(["word1", "word2", "word3"]))
How can I do this?
Answer 0 (score: 3)
IIUC, you want to use the vocabulary parameter instead of stop_words:
CV = CountVectorizer(max_features=500, vocabulary=["word1","word2","word3"])
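To see the effect, here is a minimal sketch. The two sample documents are made up for illustration, and max_features is left out because scikit-learn documents it as ignored when a vocabulary is supplied. Every word outside the vocabulary is simply dropped from the counts.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sample documents, invented for this illustration.
docs = [
    "word1 word2 other text",
    "word3 word3 something else",
]

# A fixed vocabulary restricts the output columns to exactly these words,
# in the order given. max_features is omitted: it is ignored when
# vocabulary is not None.
cv = CountVectorizer(vocabulary=["word1", "word2", "word3"])

# One row per document, one column per vocabulary word.
counts = cv.fit_transform(docs).toarray()
print(counts)
```

Words such as "other" or "something" do not appear in the result at all, which is the opposite of what stop_words does (it removes the listed words and keeps everything else).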