I need to extract the unique strings from a text file (*.txt), but my code prints the same lines over and over. How can I output each unique string only once?
import re
f = open('C:\\isg-2000.txt')
p = f.readlines()
print(len(p))
for i in range(len(p)):
    S = re.findall(r'set vrouter \".+?\"', p[i])
    if S:
        print(S)
The output looks like this:
4438
['set vrouter "untrust-vr"']
['set vrouter "trust-vr"']
['set vrouter "UntrustGi-vr"']
['set vrouter "TrustGi-vr"']
['set vrouter "CNDT-vr"']
['set vrouter "MGT"']
['set vrouter "MGT"']
['set vrouter "MGT"']
['set vrouter "untrust-vr"']
['set vrouter "trust-vr"']
['set vrouter "UntrustGi-vr"']
['set vrouter "TrustGi-vr"']
['set vrouter "CNDT-vr"']
['set vrouter "MGT"']
['set vrouter "untrust-vr"']
['set vrouter "trust-vr"']
['set vrouter "UntrustGi-vr"']
['set vrouter "TrustGi-vr"']
['set vrouter "CNDT-vr"']
['set vrouter "MGT"']
Answer 0 (score: 2)
Use a set together with a generator expression:
import re
with open('C:\\isg-2000.txt') as f:
    r = re.compile(r'set vrouter \".+?\"')
    unique_matches = set(m for line in f for m in r.findall(line))
Note that if order matters, use collections.OrderedDict:
from collections import OrderedDict
...
unique_matches = list(OrderedDict.fromkeys(m for line in f for m in r.findall(line)))
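To make the difference between the two variants concrete, here is a minimal, self-contained sketch. It assumes a small in-memory list of lines in place of the real file; SAMPLE_LINES and both helper function names are made up for illustration and are not part of the original answer.

```python
import re
from collections import OrderedDict

# Hypothetical sample lines standing in for the contents of the config file.
SAMPLE_LINES = [
    'set vrouter "untrust-vr"\n',
    'set vrouter "trust-vr"\n',
    'unset something else\n',
    'set vrouter "MGT"\n',
    'set vrouter "untrust-vr"\n',
    'set vrouter "MGT"\n',
]

PATTERN = re.compile(r'set vrouter ".+?"')


def unique_matches_unordered(lines):
    """Return the distinct matches as a set (iteration order is not guaranteed)."""
    return set(m for line in lines for m in PATTERN.findall(line))


def unique_matches_ordered(lines):
    """Return the distinct matches as a list, in order of first appearance."""
    return list(OrderedDict.fromkeys(
        m for line in lines for m in PATTERN.findall(line)))


# Print each distinct match once, in the order it first appears.
for match in unique_matches_ordered(SAMPLE_LINES):
    print(match)
```

Both helpers accept any iterable of lines, so an open file object can be passed in directly instead of the sample list.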
Answer 1 (score: 1)
Please try this:
import re
# Use a with block so the file is closed once it has been read.
with open('C:\\Users\\vlazarev\\Desktop\\isg-2000-1-2013-08-14_for_amt.txt') as f:
    s = set(re.findall(r'set vrouter \".+?\"', f.read()))
print(s)
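Searching the whole file content at once should give the same result as searching line by line here, because `.` does not match a newline unless `re.DOTALL` is set, so a match cannot span two lines. The sketch below illustrates this on a made-up sample string; SAMPLE_TEXT and the two function names are assumptions for the example only. Keep in mind that `f.read()` loads the entire file into memory.

```python
import re

# Hypothetical sample text standing in for the result of f.read().
SAMPLE_TEXT = (
    'set vrouter "trust-vr"\n'
    'set vrouter "MGT"\n'
    'set vrouter "broken\n'  # no closing quote on this line, so it never matches
    'set vrouter "trust-vr"\n'
)

PATTERN = re.compile(r'set vrouter ".+?"')


def unique_from_text(text):
    """Search the whole text in one call and de-duplicate with a set."""
    return set(PATTERN.findall(text))


def unique_from_lines(text):
    """Search line by line and de-duplicate with a set."""
    return set(m for line in text.splitlines() for m in PATTERN.findall(line))


# Without re.DOTALL, '.' stops at newlines, so both approaches agree.
print(unique_from_text(SAMPLE_TEXT) == unique_from_lines(SAMPLE_TEXT))
```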