Question

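I have a list of text lines, textlines, which is a list of strings (each ending in '\n').

I want to remove lines that occur more than once, except for lines that contain only spaces, newlines, and tabs.

In other words, if the original list is: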

textlines[0] = "First line\n"
textlines[1] = "Second line \n"
textlines[2] = "   \n"
textlines[3] = "First line\n"
textlines[4] = "   \n"

the output list should be:

textlines[0] = "First line\n"
textlines[1] = "Second line \n"
textlines[2] = "   \n"
textlines[3] = "   \n"

How can I do this?

Answer 1

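seen = set()
res = []
for line in textlines:
    if line not in seen:
        res.append(line)
        # Only remember lines with visible content; whitespace-only
        # lines are never added to seen, so every copy of them is kept.
        if line.strip():
            seen.add(line)
textlines = res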
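
As a quick check, here is a minimal sketch of the same idea wrapped in a function and run on the example from the question. The function name `dedupe_keep_blank` is made up for illustration; the key point is that only lines with non-whitespace content are recorded in `seen`.

```python
def dedupe_keep_blank(lines):
    """Drop repeated lines, but keep every whitespace-only line."""
    seen = set()
    result = []
    for line in lines:
        if line not in seen:
            result.append(line)
            # Whitespace-only lines are never recorded, so they never
            # count as duplicates.
            if line.strip():
                seen.add(line)
    return result


textlines = ["First line\n", "Second line \n", "   \n", "First line\n", "   \n"]
print(dedupe_keep_blank(textlines))
# ['First line\n', 'Second line \n', '   \n', '   \n']
```

Because membership is tested against a set, this stays roughly linear in the number of lines.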

Answer 2

Because I can't resist a good round of code golf:

seen = set()

[x for x in textlines if (x not in seen or not x.strip()) and not seen.add(x)]
Out[29]: ['First line\n', 'Second line \n', '   \n', '   \n']

This is equivalent to @hughbothwell's answer, which is the one you should use if you intend for humans to read your code :-)
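
For anyone puzzled by the one-liner: `set.add` returns `None`, so `not seen.add(x)` is always true and is there only for its side effect. Since `and` short-circuits, it runs only for elements that pass the first test. A rough sketch of the same logic unrolled into a loop (`keep` and `result` are names introduced here for illustration):

```python
textlines = ["First line\n", "Second line \n", "   \n", "First line\n", "   \n"]

seen = set()
result = []
for x in textlines:
    # Keep x if it has not been seen yet, or if it is whitespace-only.
    keep = x not in seen or not x.strip()
    if keep:
        # set.add returns None, which is why `not seen.add(x)` is
        # always True inside the comprehension.
        seen.add(x)
        result.append(x)

print(result)
# ['First line\n', 'Second line \n', '   \n', '   \n']
```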

Answer 3

new = []
for line in textlines:
    if line in new and line.strip():
        continue
    new.append(line)
textlines = new
