Question

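I hope you can help me. I am trying to process data stored in a .txt file: I want to tokenize it (into sentences and words) and add POS tags, using a pipeline approach I found online. I think I have managed to read the file in, but I am struggling to produce an output file, that is, to "write" a file-like object once the text has been tokenized. I have tried several things, but I don't think I have grasped the complexity of the task. Many thanks,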

import nltk  # the stages below only use nltk.sent_tokenize and nltk.word_tokenize




def source(path, targets):  # I used this to import the file, e.g. source('novelaTerror.txt', [...])
    # read the file line by line and push each line into every target stage
    with open(path, 'r') as texts:
        for text in texts:
            for t in targets:
                t.send(text)



def sent_tokenize_pipeline(targets):
    while True:
        text = (yield)
        sentences = nltk.sent_tokenize(text)
        for sentence in sentences:
            for target in targets:
                target.send(sentence)


def word_tokenize_pipeline(targets):
    while True:
        sentence = (yield)
        words = nltk.word_tokenize(sentence)
        for target in targets:
            target.send(words)
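
The question mentions adding POS tags, but no tagging stage appears in the code. The following is a minimal sketch of what such a stage might look like in the same coroutine style, not the method from the original pipeline. The names `pos_tag_pipeline`, `collector`, and `demo` are hypothetical. The stage falls back to `nltk.pos_tag` when no tagger is passed, which assumes NLTK and its tagger data are installed; the demo uses a dummy tagger so it runs without them.

```python
def pos_tag_pipeline(targets, tagger=None):
    """Receive token lists, tag them, and forward (word, tag) pairs."""
    if tagger is None:
        # Assumption: NLTK and its POS tagger data are available.
        from nltk import pos_tag
        tagger = pos_tag
    while True:
        words = (yield)
        tagged = tagger(words)
        for target in targets:
            target.send(tagged)


def collector(results):
    """Hypothetical end stage: append everything received to a list."""
    while True:
        results.append((yield))


def demo():
    """Run one token list through the stage with a dummy tagger."""
    def fake_tagger(words):
        return [(w, "X") for w in words]

    results = []
    end = collector(results)
    next(end)  # advance to the first (yield) before sending
    stage = pos_tag_pipeline([end], tagger=fake_tagger)
    next(stage)
    stage.send(["dark", "night"])
    return results
```

In the pipeline above, this stage would sit as a target of `word_tokenize_pipeline`, since it expects a list of words per message.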

with open('outnovelaTerror', 'w') as wtexts:  # I tried this to save the file in my working directory
    for line in texts:
        wtexts.write(line)
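
A minimal sketch of one way to get a written output file from a generator pipeline like this one, assuming the last stage is a coroutine that owns the write step. In the attempt above, `texts` only exists inside `source`, so the loop has nothing to iterate over; here the file object is handed to a final stage instead. The names `coroutine`, `split_words`, `file_sink`, and `run_demo` are hypothetical, and `split_words` is a plain-regex stand-in for the NLTK tokenizer so the example runs without NLTK. Each generator must be advanced to its first `yield` before `.send()` is called with a value, which is what the decorator does.

```python
import functools
import os
import re
import tempfile


def coroutine(func):
    """Prime a generator-based coroutine so it is ready for .send()."""
    @functools.wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first (yield)
        return gen
    return start


@coroutine
def split_words(targets):
    """Stand-in for word_tokenize_pipeline: keep runs of word characters."""
    while True:
        sentence = (yield)
        words = re.findall(r"\w+", sentence)
        for target in targets:
            target.send(words)


@coroutine
def file_sink(out_file):
    """Terminal stage: write each received token list as one line."""
    while True:
        words = (yield)
        out_file.write(" ".join(words) + "\n")


def run_demo(lines):
    """Push lines through the stages and return what was written."""
    path = os.path.join(tempfile.mkdtemp(), "out.txt")
    with open(path, "w", encoding="utf-8") as out_file:
        sink = file_sink(out_file)
        stage = split_words([sink])
        for line in lines:
            stage.send(line)
    with open(path, encoding="utf-8") as result:
        return result.read()
```

Keeping the `open(...)` call outside the sink means the `with` block closes the file once all lines have been sent, and the same sink could be reused with any other writable file-like object.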

How can I write the output to a file-like object?

0 answers: