Question

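I can't seem to get this to work. I need to open the file ranger.txt, read each line, and split each line into a list of words. For each word, check whether it is already in the list; if it is not, add it. At the end of the program, sort the resulting words alphabetically and print them.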

The result should be: ["a", "and", "buckle", "C130", "countrollin", "door", "down", "four", "Gonna", "Jump", "little", "out", "ranger", "Recon", "right", "Shuffle", "Standstrip", "the", "to", "take", "trip", "up"]

I can print the individual lists, and even one word from each list, but that's as far as I get.

rangerHandle = open("ranger.txt")
count = 0
rangerList = list()

for line in rangerHandle:
    line = line.rstrip()
    #print line works at this point
    words = line.split() # split breaks string makes another list
    #print words works at this point
    if words[count] not in words: 
        rangerList.append(words[count])        
        count += 1
    print rangerList

The ranger.txt file is:

C130 rollin down the strip
Recon ranger
Gonna take a little trip
Stand up, buckle up,
Shuffle to the door
Jump right out and count to four

If you are going to downvote, please at least give an explanation.

Answer 1

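We can build the list without checking for duplicates. Afterwards we remove the duplicates by converting the list to a set, and then sort the set case-insensitively: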

with open("ranger.txt") as f:
    l = [w for line in f for w in line.strip().split()]
print(sorted(set(l), key=lambda s: s.lower()))

Result:

[
    'a', 'and', 'buckle', 'C130', 'count', 'door', 'down', 'four', 
    'Gonna', 'Jump', 'little', 'out', 'ranger', 'Recon', 'right', 
    'rollin', 'Shuffle', 'Stand', 'strip', 'take', 'the', 'to', 'trip',
    'up,'
]
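
Note that the last entry is 'up,' with the comma still attached, because split() only splits on whitespace. Below is a minimal sketch of one way to drop punctuation at the edges of each word before deduplicating. It assumes that stripping leading and trailing punctuation is acceptable for this file; the helper name unique_words is made up for illustration, and the lines are inlined rather than read from ranger.txt.

```python
import string

def unique_words(lines):
    """Return the distinct words in lines, sorted case-insensitively."""
    words = set()
    for line in lines:
        for w in line.split():
            w = w.strip(string.punctuation)  # 'up,' becomes 'up'
            if w:  # skip tokens that consisted only of punctuation
                words.add(w)
    return sorted(words, key=str.lower)

lines = ["Stand up, buckle up,", "Shuffle to the door"]
print(unique_words(lines))
# ['buckle', 'door', 'Shuffle', 'Stand', 'the', 'to', 'up']
```

Since the function accepts any iterable of lines, an open file object can be passed in directly, for example inside `with open("ranger.txt") as f:`.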

Answer 2

First, when working with files (https://docs.python.org/2/tutorial/inputoutput.html) it is better to use the with ... syntax.

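Second, if I were you, I would use sets (https://docs.python.org/2/library/sets.html) instead of lists. Their advantage is that the same element cannot be added twice, so you don't need to check whether a word is already in the set. For each line, I would create a new set containing the words on that line and merge it with the words collected so far using the union method.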

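words = set()
with open("ranger.txt") as f:
    for line in f:
        newset = set(line.rstrip().split())  # distinct words on this line
        words = words.union(newset)          # merge into the running set
# sorted() turns the set into a sorted list; without a key the sort is
# case-sensitive, so capitalized words come before lowercase ones
words = sorted(words)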
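
For comparison, here is a sketch of a list-based version that stays close to the steps described in the question. The original loop tested words[count] against words, which is the list the word was just taken from, so the test could never succeed; it also looked at no more than one word per line. This version visits every word and checks it against the result list instead. The function name collect_unique and the sample lines are made up for illustration, and key=str.lower is used on the assumption that the case-insensitive order shown in the question is what is wanted.

```python
def collect_unique(lines):
    """List-based version: append a word only if it has not been seen yet."""
    ranger_list = []
    for line in lines:
        for word in line.split():        # visit every word on the line
            if word not in ranger_list:  # check the result list, not the line
                ranger_list.append(word)
    return sorted(ranger_list, key=str.lower)

sample = ["Recon ranger", "Gonna take a little trip"]
print(collect_unique(sample))
# ['a', 'Gonna', 'little', 'ranger', 'Recon', 'take', 'trip']
```

The membership test on a list scans the whole list each time, which is fine for a file this small; the set-based versions avoid that cost on larger inputs.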

Python: filtering and sorting a list