Question

I am trying to learn Python. Here is the relevant part of the exercise:

For each word, check whether the word is already in the list. If the word is not in the list, add it to the list.

Here is what I have.

fhand = open('romeo.txt')
output = []

for line in fhand:
    words = line.split()
    for word in words:
        if word is not output:
            output.append(word)

print sorted(output)

Here is what I get.

['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'and', 'and', 'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'is', 'is', 'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'sun', 'the', 'the', 'the', 'through', 'what', 'window', 'with', 'yonder']

Note the duplicates (and, is, sun, etc.).

How do I get only the unique values?

Answer 1

To eliminate duplicates from a list, you can maintain an auxiliary list and check against it.

myList = ['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'and', 'and', 
     'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'is', 'is', 'kill', 'light', 
     'moon', 'pale', 'sick', 'soft', 'sun', 'sun', 'the', 'the', 'the', 
     'through', 'what', 'window', 'with', 'yonder']

auxiliaryList = []
for word in myList:
    if word not in auxiliaryList:
        auxiliaryList.append(word)

Output:

['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east', 
  'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick',
  'soft', 'sun', 'the', 'through', 'what', 'window', 'with', 'yonder']

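This is easy to understand, and the code is self-explanatory. However, the simplicity comes at the cost of efficiency: the linear scan over the growing list turns a linear algorithm into a quadratic one.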
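To make the quadratic cost concrete, here is a rough sketch that counts element comparisons instead of timing anything. The helper dedupe_counting is a hypothetical name introduced for illustration; its inner loop spells out the scan that a list membership test performs.

```python
def dedupe_counting(words):
    """Deduplicate with a list scan and count the equality checks performed.

    Hypothetical helper for illustration: the inner loop mimics what
    `word not in unique` does on a list.
    """
    unique = []
    comparisons = 0
    for word in words:
        found = False
        for kept in unique:
            comparisons += 1
            if kept == word:
                found = True
                break
        if not found:
            unique.append(word)
    return unique, comparisons


# All-distinct input is the worst case: every word is compared
# against everything kept so far.
for n in (100, 200, 400):
    distinct = [str(i) for i in range(n)]
    _, count = dedupe_counting(distinct)
    print(n, count)
```

With all-distinct input the count is n(n-1)/2, so doubling the input roughly quadruples the work.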

Use set()! A set is an unordered collection with no duplicate elements.
Basic uses include membership testing and eliminating duplicate entries.

auxiliaryList = list(set(myList))

Output:

['and', 'envious', 'already', 'fair', 'is', 'through', 'pale', 'yonder', 
 'what', 'sun', 'Who', 'But', 'moon', 'window', 'sick', 'east', 'breaks', 
 'grief', 'with', 'light', 'It', 'Arise', 'kill', 'the', 'soft', 'Juliet']
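A small sketch of the two set uses mentioned in this answer, membership testing and duplicate removal. The sample words are made up for illustration. Because a set has no defined order, the example sorts before printing so the result is reproducible.

```python
words = ['sun', 'the', 'sun', 'and', 'the']

unique_words = set(words)       # duplicates collapse automatically

print('sun' in unique_words)    # membership test: True
print('moon' in unique_words)   # False
print(len(unique_words))        # 3
print(sorted(unique_words))     # ['and', 'sun', 'the']
```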

Answer 2

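You should use the not in operator to check whether an item is in the list, rather than the is not operator. is not compares object identity, and a word is never the same object as the list, so that test is always true:

if word not in output:

BTW, using set is very efficient (see Time complexity):

with open('romeo.txt') as fhand:
    output = set()
    for line in fhand:
        words = line.split()
        output.update(words)

Update: set does not preserve the original order. To preserve the order, use the set as an auxiliary data structure.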
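The order-preserving variant could look roughly like the sketch below. To keep it self-contained it reads from an in-memory io.StringIO rather than romeo.txt, and the two sample lines are an assumption standing in for the real file.

```python
import io

# Stand-in for open('romeo.txt'); the sample text is an assumption.
fhand = io.StringIO(
    "It is the east and Juliet is the sun\n"
    "Arise fair sun and kill the envious moon\n"
)

output = []     # unique words, in first-seen order
seen = set()    # auxiliary set for fast membership tests

for line in fhand:
    for word in line.split():
        if word not in seen:    # set lookup instead of scanning `output`
            seen.add(word)
            output.append(word)

print(output)
```

The list carries the order and the set carries the membership test, so each word costs one hash lookup rather than a scan of everything collected so far.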

Answer 3

Here is a "one-liner" that uses this implementation of removing duplicates while preserving order:

def unique(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

output = unique([word for line in fhand for word in line.split()])

The last line flattens fhand into a list of words, then calls unique() on the resulting list.
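The comprehension in unique() relies on set.add returning None: when x in seen is false, seen_add(x) runs, records the item, and evaluates as falsy, so the item is kept. A spelled-out equivalent might look like the sketch below, where unique_verbose is a hypothetical name. The last line shows dict.fromkeys, which is a different technique that also keeps first-seen order on Python 3.7 and later.

```python
def unique_verbose(seq):
    """Same result as unique(), written as an explicit loop."""
    seen = set()
    result = []
    for x in seq:
        if x in seen:
            continue        # already emitted, skip it
        seen.add(x)         # returns None, which the one-liner exploits
        result.append(x)
    return result


words = ['the', 'sun', 'and', 'the', 'moon', 'sun']
print(unique_verbose(words))
print(list(dict.fromkeys(words)))   # dicts keep insertion order (3.7+)
```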

Answer 4

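One approach is to check whether the value is already in the list before adding it, which is what Tony's answer does. If you want to remove duplicate values after the list has been created, you can use set() to convert the existing list into a set of unique values, then use list() to convert it back into a list. All in just one line: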

list(set(output))

If you want the result sorted alphabetically, just wrap the above in sorted(). Here is the result:

['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'the', 'through', 'what', 'window', 'with', 'yonder']
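Since sorted() accepts any iterable and always returns a new list, the intermediate list() call can be dropped when sorting. A minimal sketch with made-up sample data:

```python
output = ['the', 'sun', 'and', 'the', 'Juliet', 'sun']

unique_sorted = sorted(set(output))
print(unique_sorted)

# Default string sorting is case-sensitive, so capitalized words come first.
# Passing key=str.lower sorts case-insensitively instead.
unique_sorted_ci = sorted(set(output), key=str.lower)
print(unique_sorted_ci)
```

The case-sensitive default explains why 'Arise', 'But', 'It', 'Juliet', and 'Who' appear ahead of 'already' in the result above.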

Answer 5


fh = open('romeo.txt')
content = fh.read()
words = content.split()

mylist = list()
for word in words:
    if word not in mylist:
        mylist.append(word)

mylist.sort()
print(mylist)

fh.close()

Add only unique values to a list in Python
