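I have a txt file containing a news article (I think it is stored as a list). I want to tokenize the words, tag them, and save the results to their own files.
I am running the code below with the nltk library.
For some reason the code runs, but the files are empty. If I run only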
with open(news_file) as f1, open(token_file, "w") as f2, open(tagged_file, "w") as f3:
    f2.writelines(('\n'.join(wt(words)) for words in f1.readlines()))
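then the new file lists each word of the news article on its own line.
With the following code, I have a problem with the line tokenized = ' '.join(wt(tagged)), which raises TypeError: expected string or bytes-like object. I also tried str.join, but to no avail.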
with open(news_file) as f1, open(token_file, "w") as f2, open(tagged_file, "w") as f3:
    tagged = pos_tag(f1.readlines())
    tokenized = ' '.join(word_tokenize(tagged))
    for token_words in tokenized:
        print(' '.join(token_words), file=f2)
    for tag_words in tagged:
        print(' '.join(tag_words), file=f3)
    #f2.writelines(('\n'.join(wt(words)) for words in f1.readlines()))
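For comparison, here is a minimal sketch of one possible ordering. It assumes the intent is to tokenize each line first and then tag the resulting tokens, and that wt is an alias for word_tokenize. In NLTK, word_tokenize expects a string and returns a list of tokens, while pos_tag expects a list of tokens and returns (word, tag) tuples. Passing the output of pos_tag into word_tokenize hands it tuples instead of a string, which would explain the TypeError. The helper names tokenize_then_tag and write_outputs and the tab-separated word/tag output are hypothetical choices for illustration, not part of the code above.

```python
from typing import Callable, Iterable, List, Tuple


def tokenize_then_tag(
    lines: Iterable[str],
    tokenize: Callable[[str], List[str]],
    tag: Callable[[List[str]], List[Tuple[str, str]]],
) -> Tuple[List[str], List[str]]:
    """Return (token_lines, tagged_lines), one output line per token."""
    token_lines: List[str] = []
    tagged_lines: List[str] = []
    for line in lines:
        tokens = tokenize(line)  # string in, list of token strings out
        if not tokens:
            continue  # skip blank lines
        token_lines.extend(tokens)
        # The tagger receives the token list, not the raw line and not tuples.
        tagged_lines.extend(f"{word}\t{pos}" for word, pos in tag(tokens))
    return token_lines, tagged_lines


def write_outputs(news_file: str, token_file: str, tagged_file: str) -> None:
    # Imported here so the helper above can be tried without NLTK installed.
    # Assumes the NLTK tokenizer and tagger data have already been downloaded.
    from nltk import pos_tag, word_tokenize

    with open(news_file) as f1:
        token_lines, tagged_lines = tokenize_then_tag(f1, word_tokenize, pos_tag)
    # Open the output files only after processing succeeded.
    with open(token_file, "w") as f2, open(tagged_file, "w") as f3:
        f2.write("\n".join(token_lines) + "\n")
        f3.write("\n".join(tagged_lines) + "\n")
```

A likely reason the files end up empty is that open(..., "w") truncates them as soon as the with statement starts, and the exception is then raised before anything is written. Opening the output files only after tokenizing and tagging, as in this sketch, avoids that.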
Any help would be greatly appreciated.
Thanks :)