I tried to import it as:
from nltk.corpus import PunktSentenceTokenizer
which gave me the following error:
ImportError: cannot import name 'PunktSentenceTokenizer'
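(For context: the import fails because `PunktSentenceTokenizer` lives in `nltk.tokenize`, not `nltk.corpus`. A minimal sketch of the working import, with a made-up sample sentence:)

```python
# PunktSentenceTokenizer is defined in nltk.tokenize (nltk.tokenize.punkt),
# not in nltk.corpus, so import it from there.
from nltk.tokenize import PunktSentenceTokenizer

tokenizer = PunktSentenceTokenizer()
# tokenize() takes a single string and returns a list of sentences.
sentences = tokenizer.tokenize("This is one sentence. This is another.")
print(sentences)
```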
So then I tried creating the tokenizer as:
tokenizer = nltk.tokenize.punkt.PunktSentenceTokenizer()
but the code below raises an error again:
tagged_sentences = nltk.corpus.treebank.tagged_sents()
cutoff = int(.75 * len(tagged_sentences))
training_sentences = DataPrep.train_news['Statement']
print(training_sentences)
custom_sent_tokenizer = tokenizer.tokenize(training_sentences)
tokenized = custom_sent_tokenizer
The error is: TypeError: expected string or bytes-like object
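(The TypeError happens because `tokenizer.tokenize()` expects a single string, while `DataPrep.train_news['Statement']` is a pandas Series of strings. Applying the tokenizer per row avoids it. A sketch under that assumption, using a stand-in Series since `DataPrep` is not shown:)

```python
import pandas as pd
from nltk.tokenize import PunktSentenceTokenizer

tokenizer = PunktSentenceTokenizer()

# Stand-in for DataPrep.train_news['Statement']: a Series of statement strings.
training_sentences = pd.Series([
    "Taxes went up. Spending went down.",
    "The bill passed. It takes effect next year.",
])

# tokenize() accepts one string at a time, so map it over the Series
# instead of passing the whole Series in a single call.
tokenized = training_sentences.apply(tokenizer.tokenize)
print(tokenized[0])  # each row becomes a list of sentence strings
```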