Python: how do I remove from one text file the text that appears in another text file?

Time: 2017-05-14 22:56:06

Tags: python file text

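This is a bit hard to explain, but my script has a text file that contains a bunch of entries, one per line. I will also have a second file, the master record. I want to take the first file and remove everything that matches the master record. Some entries will end up not being in the first file. Here is an example: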

First file:

Cow 
Duck
Sheep

Master record:

Duck
Sheep 
Cat
Dog

Any help is appreciated!

3 answers:

Answer 0 (score: 0)

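Read through the master file and put its lines into a set, then compare each line of the second file against the entries in the master set: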

Code:

# read in the master file and put each line into a set
with open('master') as f:
    master = {w.strip() for w in f.readlines()}

# read through the second file and keep each line not in master
with open('file1') as f:
    allowed = [w.strip() for w in f.readlines() if w.strip() not in master]

# show the allowed lines
for w in allowed:
    print(w)
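
The question asks for the matching lines to be removed from the first file itself, so the filtered lines still have to be written back. A minimal sketch of that step, assuming the same file layout as above; the helper name `remove_master_lines` is hypothetical and not part of the original answer:

```python
def remove_master_lines(first_path, master_path):
    """Rewrite first_path, dropping every line that also appears in master_path.

    Hypothetical helper for illustration; lines are compared after
    stripping surrounding whitespace.
    """
    # collect the master entries in a set for fast membership tests
    with open(master_path) as f:
        master = {line.strip() for line in f}

    # keep only the lines of the first file that are not in the master set
    with open(first_path) as f:
        kept = [line.strip() for line in f if line.strip() not in master]

    # overwrite the first file with the lines that survived the filter
    with open(first_path, 'w') as f:
        for line in kept:
            f.write(line + '\n')

    return kept
```

Called as `remove_master_lines('file1', 'master')` on the example from the question, this should leave only `Cow` in the first file. The file is overwritten in place, so it may be worth keeping a backup when trying it out.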

Answer 1 (score: 0)

Try this (assuming both of your lists are files):

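# read both files; the with blocks close them automatically
with open('master.txt', 'r') as master_file:
    master_arr = master_file.read().split('\n')
with open('file.txt', 'r') as f:
    f_arr = f.read().split('\n')
# keep every line of file.txt that does not appear in master.txt
fin_arr = []
for line in f_arr:
    if line not in master_arr:
        fin_arr.append(line)
final = '\n'.join(fin_arr)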
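
One detail of `split('\n')`: a text that ends with a newline yields a trailing empty string, which then takes part in the comparison. Below is a small sketch of the same filter using `str.splitlines()`, which does not produce that trailing entry, plus a set for the membership test. The inline strings stand in for the two files and are taken from the question's example; they are an assumption, not the asker's real data:

```python
first_text = 'Cow\nDuck\nSheep\n'
master_text = 'Duck\nSheep\nCat\nDog\n'

# split('\n') leaves an empty string after the final newline
print(first_text.split('\n'))     # ['Cow', 'Duck', 'Sheep', '']
# splitlines() does not
print(first_text.splitlines())    # ['Cow', 'Duck', 'Sheep']

# a set makes each "in" check cheap, even for a large master record
master_lines = set(master_text.splitlines())
fin_arr = [line for line in first_text.splitlines() if line not in master_lines]
final = '\n'.join(fin_arr)
print(final)                      # Cow
```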

Answer 2 (score: 0)

Note that this does not cover reading or writing the files.

Data:

file = """
cow
duck
sheep
"""

master_record = """
duck
sheep
cat
dog
"""

Now for the one-line list comprehension that nobody wants to see:

print([i for i in [x for x in file.replace('\n', ' ').split(' ') if x in master_record.replace('\n', ' ').split(' ')] if i])

This returns a list of all the words in file that also appear in the master record.

Broken down:

found = []

# Loop through every word in `file`, replacing newlines with spaces,
for word in file.replace('\n', ' ').split(' '):
    # Check if the word is in the master file,
    if word in master_record.replace('\n', ' ').split(' '):
        # Make sure the word contains something,
        if word:
            # Add this word to found,
            found.append(word)

# Print what we found,
print(found)
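
The snippets above collect the words that the two texts share. For the removal asked about in the question, the condition is inverted. A minimal sketch, reusing the data from this answer under the hypothetical names `file_text` and `remaining`, and calling `str.split()` without arguments so that newlines and spaces are handled in one step:

```python
file_text = """
cow
duck
sheep
"""

master_record = """
duck
sheep
cat
dog
"""

# split() with no arguments splits on any whitespace and skips empty strings
master_words = set(master_record.split())

# keep only the words that are NOT in the master record
remaining = [word for word in file_text.split() if word not in master_words]
print(remaining)  # ['cow']
```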

Hope this helps!

-Coolq