Question

I want to print the unique lines present in a text file.

For example, if the content of my text file is:

I want my Python program to print:

12474
54675
74564

I am using Python 2.7.

Answer 1

Try this:

from collections import OrderedDict

seen = OrderedDict()
for line in open('file.txt'):
    line = line.strip()
    seen[line] = seen.get(line, 0) + 1

print("\n".join([k for k,v in seen.items() if v == 1]))

This prints:

12474
54675
74564

Update: thanks to the comments below, this is even better:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    pass

with open('file.txt') as f:
    seen = OrderedCounter([line.strip() for line in f])
    print("\n".join([k for k,v in seen.items() if v == 1]))

Answer 2

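Use count() to check how many times each element occurs in the list, and use index() with pop() in a for loop to remove every occurrence of a duplicated item:

with open("file.txt", "r") as f:
    data = f.readlines()

# Iterate over a copy: removing items from a list while looping over it skips elements
for x in list(data):
    if data.count(x) > 1:  # item is a duplicate
        for i in range(data.count(x)):
            data.pop(data.index(x))  # find the index of each occurrence and remove it

with open("file.txt", "w") as f:
    f.write("".join(data))  # write the remaining lines back to the file as a string

Contents of file.txt:

12474
54675
74564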
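One thing to watch with this kind of in-place removal: if the loop walks the same list it is shrinking, some items never get examined. Below is a rough sketch of the difference, using made-up sample data and two hypothetical helper names (`drop_repeated_buggy`, `drop_repeated`) that are not part of the answer itself.

```python
def drop_repeated_buggy(items):
    """Loops over the very list it mutates, so shifted items can be skipped."""
    data = list(items)
    for x in data:
        if data.count(x) > 1:
            for _ in range(data.count(x)):
                data.remove(x)
    return data


def drop_repeated(items):
    """Loops over a snapshot, so every original item is examined."""
    data = list(items)
    for x in list(data):  # snapshot taken before any removal
        if data.count(x) > 1:
            for _ in range(data.count(x)):
                data.remove(x)
    return data


# Made-up sample with interleaved duplicates; nothing in it occurs only once.
sample = ["a", "y", "a", "c", "y", "c"]
print(drop_repeated_buggy(sample))  # the two "y" entries survive
print(drop_repeated(sample))        # empty list
```

With duplicates that sit next to each other the two versions tend to agree, which is why the problem is easy to miss in a quick test.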

Answer 3

You can use OrderedDict and Counter to remove duplicates while preserving order:

from collections import OrderedDict, Counter

class OrderedCounter(Counter, OrderedDict):
    pass

with open('/tmp/hello.txt') as f:
    ordered_counter = OrderedCounter(f.readlines())

new_list = [k.strip() for k, v in ordered_counter.items() if v==1]
# ['12474', '54675', '74564']
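A small caveat, as far as I can tell: readlines() keeps the trailing newline, so if the last line of the file has no newline it will not compare equal to an identical earlier line. Here is a sketch of normalizing before counting. The in-memory list `raw` is made-up data standing in for f.readlines(), and the variable names are only illustrative.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order in which keys were first seen."""
    pass


# Made-up stand-in for f.readlines(); the last line lacks a trailing newline.
raw = ["alpha\n", "beta\n", "alpha"]

# Counting the raw lines treats "alpha\n" and "alpha" as different keys.
naive = OrderedCounter(raw)
naive_result = [k.strip() for k, v in naive.items() if v == 1]

# Stripping the line ending first makes both spellings count as one key.
normalized = OrderedCounter(line.rstrip("\r\n") for line in raw)
normalized_result = [k for k, v in normalized.items() if v == 1]

print(naive_result)       # "alpha" shows up even though it repeats
print(normalized_result)  # only "beta" remains
```

Whether to strip only the line ending or all surrounding whitespace (as strip() does) depends on whether leading and trailing spaces should count as part of the line.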

Answer 4

Not the most efficient, since it calls count once per line (quadratic time overall), but simple:

with open("input.txt") as f:
    orig = list(f)
    filtered = [x for x in orig if orig.count(x)==1]

print("".join(filtered))

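Convert the file to a list of lines
Build a list comprehension that keeps only the lines occurring exactly once
Print the list (joined with an empty string, since the newlines are still in the lines)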
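If the file is large, the repeated count calls can be avoided with a single pass that tracks which lines have been seen more than once. This is a minimal sketch rather than a drop-in replacement. `keep_unique` is a hypothetical helper name, and the sample list is made up in place of list(f).

```python
def keep_unique(lines):
    """Return the lines that occur exactly once, in their original order."""
    seen = set()      # every line encountered so far
    repeated = set()  # lines encountered at least twice
    for line in lines:
        if line in seen:
            repeated.add(line)
        else:
            seen.add(line)
    # Second pass keeps only lines that were never marked as repeated.
    return [line for line in lines if line not in repeated]


# Made-up stand-in for list(f); the newlines are kept, as in the answer above.
orig = ["1\n", "2\n", "1\n", "3\n"]
filtered = keep_unique(orig)
print("".join(filtered))
```

Set membership checks take roughly constant time, so the whole thing is about linear in the number of lines, at the cost of holding two sets in memory. Note that `lines` has to be a list (or another re-iterable sequence), because it is traversed twice.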

Filter unique lines in a text file in Python
