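I have a list of text lines: textlines
It is a list of strings (each ending with '\n').
I want to remove lines that occur more than once, except for lines that contain only spaces, newlines, and tabs.
In other words, if the original list is: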
textlines[0] = "First line\n"
textlines[1] = "Second line \n"
textlines[2] = " \n"
textlines[3] = "First line\n"
textlines[4] = " \n"
the output list would be:
textlines[0] = "First line\n"
textlines[1] = "Second line \n"
textlines[2] = " \n"
textlines[3] = " \n"
How can I do this?
Answer 0 (score: 3)
seen = set()
res = []
for line in textlines:
    if line not in seen:
        res.append(line)
        # Remember only non-blank lines, so blank lines are never treated as duplicates
        if line.strip():
            seen.add(line)
textlines = res
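For convenience, the loop can be wrapped in a function and run on the sample from the question. This is only a minimal sketch: the name `dedupe_nonblank` is made up for illustration, and it assumes that only non-blank lines should be recorded in `seen`, so that whitespace-only lines always pass through.

```python
def dedupe_nonblank(lines):
    """Drop repeated non-blank lines; keep first occurrences and every blank line."""
    seen = set()
    result = []
    for line in lines:
        if line not in seen:
            result.append(line)
            # Only non-blank lines are remembered; blank lines never enter `seen`.
            if line.strip():
                seen.add(line)
    return result


textlines = ["First line\n", "Second line \n", " \n", "First line\n", " \n"]
textlines = dedupe_nonblank(textlines)
print(textlines)
# ['First line\n', 'Second line \n', ' \n', ' \n']
```

Using a set for `seen` keeps each membership test roughly constant-time, while `result` preserves the original order.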
Answer 1 (score: 1)
Because I can't resist a good round of code golf:
seen = set()
[x for x in textlines if (x not in seen or not x.strip()) and not seen.add(x)]
Out[29]: ['First line\n', 'Second line \n', ' \n', ' \n']
This is equivalent to @hughbothwell's answer, which is the one you should use if you intend humans to read your code :-)
Answer 2 (score: 0)
new = []
for line in textlines:
    # Skip a non-blank line that has already been kept
    if line in new and line.strip():
        continue
    new.append(line)
textlines = new
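Note that `line in new` scans the list on every iteration, which may become slow for long inputs. One possible variant, sketched here under the assumption that the lines are hashable strings, keeps the same skip-and-continue structure but tracks seen lines in a set and yields results lazily. The name `iter_unique_nonblank` is hypothetical.

```python
def iter_unique_nonblank(lines):
    """Yield lines in order, skipping non-blank lines that were already yielded."""
    seen = set()
    for line in lines:
        if line.strip():
            # Non-blank line: skip it if it has appeared before.
            if line in seen:
                continue
            seen.add(line)
        # Blank lines fall through and are always yielded.
        yield line


sample = ["First line\n", "Second line \n", " \n", "First line\n", " \n"]
deduped = list(iter_unique_nonblank(sample))
print(deduped)
# ['First line\n', 'Second line \n', ' \n', ' \n']
```

Because it is a generator, it should also accept an open file object in place of a list, so the whole input need not be held in memory first.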