Question

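I am trying to open a text file, remove certain words (those ending in "]", as in the code below), and then write the new content to a new file. With the following code, new_content contains what I need and a new file is created, but it is empty. I can't figure out why. I have tried different indentation and passing in an encoding, with no luck. Any help is greatly appreciated.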

import glob
import os
import nltk, re, pprint
from nltk import word_tokenize, sent_tokenize
import pandas
import string
import collections

path = "/pathtofiles"

for file in glob.glob(os.path.join(path, '*.txt')):
    if file.endswith(".txt"):
        f = open(file, 'r')
        flines = f.readlines()
        for line in flines: 
            content = line.split() 

            for word in content:
                if word.endswith(']'):
                    content.remove(word)

            new_content = ' '.join(content)

            f2 = open((file.rsplit( ".", 1 )[ 0 ] ) + "_preprocessed.txt", "w")
            f2.write(new_content)
            f.close

Answer 1

This should work, @firefly. If anything is unclear, I'm happy to answer questions.

import glob
import os

path = "/pathtofiles"

for file in glob.glob(os.path.join(path, '*.txt')):
    if file.endswith(".txt"):
        # Read every line first; the with-block closes the input file.
        with open(file, 'r') as f:
            flines = f.readlines()

        new_content = []
        for line in flines:
            content = line.split()

            # Collect the words to keep instead of removing from the list being iterated.
            new_content_line = []

            for word in content:
                if not word.endswith(']'):
                    new_content_line.append(word)

            new_content.append(' '.join(new_content_line))

        # Open the output once per input file, after all lines are processed.
        with open((file.rsplit( ".", 1 )[ 0 ] ) + "_preprocessed.txt", "w") as f2:
            f2.write('\n'.join(new_content))
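
A note on why the question's version likely produced an empty file: it reopens the output in "w" mode for every line, and "w" truncates the file each time, so at most the last line's text can survive (a trailing blank line leaves nothing). In addition, f.close without parentheses only names the method and never calls it, and calling content.remove(word) while looping over content makes the loop skip the word after each removal. The sketch below is one possible way to package the same logic into small functions. The names strip_bracket_words, preprocess_file and preprocess_folder are hypothetical, and skipping files that already end in "_preprocessed.txt" is an assumption added here so that a second run does not reprocess its own output.

```python
import glob
import os


def strip_bracket_words(line):
    """Return the line with every whitespace-separated word ending in ']' removed."""
    # Building a new sequence avoids mutating a list while iterating over it.
    return ' '.join(word for word in line.split() if not word.endswith(']'))


def preprocess_file(src_path):
    """Write a cleaned copy of src_path next to it and return the new path."""
    root, _ext = os.path.splitext(src_path)
    dst_path = root + "_preprocessed.txt"
    # Each file is opened exactly once; the with-block closes both, even on error.
    with open(src_path, 'r') as src, open(dst_path, 'w') as dst:
        for line in src:
            dst.write(strip_bracket_words(line) + '\n')
    return dst_path


def preprocess_folder(path):
    """Process every .txt file in path, skipping output from earlier runs (assumption)."""
    written = []
    for src_path in sorted(glob.glob(os.path.join(path, '*.txt'))):
        if src_path.endswith("_preprocessed.txt"):
            continue
        written.append(preprocess_file(src_path))
    return written
```

Calling preprocess_folder("/pathtofiles") would then do the same job as the loop above. One small difference: this sketch writes a newline after every line, whereas '\n'.join(new_content) leaves no trailing newline at the end of the file.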

How to open a file, transform strings, and write to a new file
