Python: remove duplicates from a list and sort it

Date: 2014-12-15 09:23:01

Tags: python list sorting count

I want to remove the duplicates from the output so that it looks like this:

 Boston Americans 1
 New York Giants 5
 Chicago White Sox 3
 Chicago Cubs 2
 Pittsburgh Pirates 5

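I want to sort the items alphabetically and print them out, and, in a different part of the program, print the items ordered by number of wins.

The output I get (not every part of my list is shown here, but the item counts are correct):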

['Boston Americans', 'New York Giants', 'Chicago White Sox', 'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates']

Boston Americans 1
New York Giants 5
Chicago White Sox 3
Chicago Cubs 2
Chicago Cubs 2
Pittsburgh Pirates 5

Here is my code:

def main():
    winners=[]
    with open("WorldSeriesWinners.txt", "r") as f:
        for line in f:
            a=line.strip()
            winners.append(a)
    print(winners)

    for n in winners:
        if n in winners:
            print(n, winners.count(n))

    name=input("Enter some team name: ")
    print("They won world cup:",  winners.count(name))

main()

4 answers:

Answer 0 (score: 2)

This will give you a sorted list of unique values:

with open("data.csv", "r") as f:
    # strip each line, drop duplicates with a set, then sort alphabetically
    values = sorted(set(line.strip() for line in f))

print("\n".join(values))

which produces

Boston Americans 1
Chicago Cubs 2
Chicago White Sox 3
New York Giants 5
Pittsburgh Pirates 5
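
Note that going through a set discards the original order. If the first-seen order from the question's desired output matters, one possible sketch is shown below. It assumes Python 3.7 or later, where dicts preserve insertion order, and the winners list is just the sample from the question rather than the real file contents.

```python
# Sample data copied from the question; the real program would read it from the file.
winners = ['Boston Americans', 'New York Giants', 'Chicago White Sox',
           'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates']

# dict.fromkeys keeps only the first occurrence of each key, in insertion order.
unique_in_order = list(dict.fromkeys(winners))

# sorted() returns a new alphabetically ordered list and leaves the input untouched.
unique_sorted = sorted(unique_in_order)

print(unique_in_order)
print(unique_sorted)
```

The counts can still be taken from the original list (for example with winners.count(name)), since only the printed names are de-duplicated here.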

Answer 1 (score: 0)

You should probably use collections.Counter (from the standard library), which is made for exactly this:

import collections

l = ['Boston Americans', 'New York Giants', 'Chicago White Sox', 'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates']
c = collections.Counter(l)
for name, score in c.most_common():
    print(name, score)

# or use c.get(name) to get to a team's score directly

In your case, you don't even need the intermediate list (as Marcin suggests below):

with open("WorldSeriesWinners.txt", "r") as f:
    c = collections.Counter(winner.strip() for winner in f)
# Do the for name, score... thing above

(Consider using reversed(c.most_common()) to get the items in ascending order.)
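
To cover both orderings asked for in the question, a minimal sketch (Python 3) might sort the Counter's items once alphabetically and once by count. The list literal stands in for the file contents, and the names by_name and by_wins are only illustrative.

```python
import collections

# Stand-in for the stripped lines of WorldSeriesWinners.txt.
winners = ['Boston Americans', 'New York Giants', 'Chicago White Sox',
           'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates']
counts = collections.Counter(winners)

# Alphabetical order: (name, count) tuples compare by name first.
by_name = sorted(counts.items())

# Descending by count; ties are broken alphabetically by name.
by_wins = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

for name, wins in by_name:
    print(name, wins)
```

Sorting with an explicit key makes the tie-breaking rule visible, whereas most_common() leaves ties in first-encountered order.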

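Answer 2 (score: 0)

I made a few changes to your main function. It is not the most beautiful code, but I did not want to change your function completely. For a better solution, take a look at Counter or defaultdict, as in the other answer.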

def main():
    winners=[]
    with open("data.csv", "r") as f:
        for line in f:
            # split the line and separate team name from the score
            split_line=line.strip().split()
            winners.append((" ".join(split_line[0:-1]), int(split_line[-1])))
    print(winners)

    # use a dict to sum up the scores
    out_dict = {}

    for team_name, score in winners:
        if team_name not in out_dict:
            out_dict[team_name] = int(score)
        else:
            out_dict[team_name] += int(score)

    # sort the out_dict using both team name and the score, and print the result
    for k,v in sorted(out_dict.items(), key = lambda kv: (kv[0],kv[1])):
        print(k,v)

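The result (the tuple-style lines suggest it was run under Python 2; under Python 3, print(k,v) prints the name and the number without parentheses):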

('Boston Americans', 1)
('Chicago Cubs', 6)
('Chicago White Sox', 3)
('New York Giants', 8)
('Pittsburgh Pirates', 7)

Test input file:

Boston Americans 1
New York Giants 5
Chicago White Sox 3
Chicago Cubs 2
New York Giants 3
Chicago Cubs 2
Pittsburgh Pirates 5
Chicago Cubs 2
Pittsburgh Pirates 2
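
For comparison, the defaultdict approach mentioned in this answer could be sketched as follows. It assumes the same "team name followed by a number" line format as the test input file, and the lines list is a stand-in for reading that file.

```python
from collections import defaultdict

# Stand-in for the lines of the test input file shown above.
lines = [
    "Boston Americans 1", "New York Giants 5", "Chicago White Sox 3",
    "Chicago Cubs 2", "New York Giants 3", "Chicago Cubs 2",
    "Pittsburgh Pirates 5", "Chicago Cubs 2", "Pittsburgh Pirates 2",
]

totals = defaultdict(int)  # teams not seen yet start at 0
for line in lines:
    # rsplit with maxsplit=1 separates the trailing number from the team name
    team_name, score = line.strip().rsplit(maxsplit=1)
    totals[team_name] += int(score)

# Iterate the team names alphabetically.
for team_name in sorted(totals):
    print(team_name, totals[team_name])
```

The defaultdict removes the need for the "if team_name not in out_dict" branch, since a missing key is created with a value of 0 on first access.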

Answer 3 (score: 0)

Using a dictionary reduces the code further to:

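def main():
    winners={}
    with open("WorldSeriesWinners.txt", "r") as f:
        for line in f:
            a=line.strip()
            # count one win per line; setdefault returns 0 for a new team
            winners[a] = winners.setdefault(a, 0) + 1
    # sorted() iterates the team names alphabetically
    for winner in sorted(winners):
        print(winner, winners[winner])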

If you need to sort by item count, use a named tuple instead of a dictionary. The advantage is that you can then sort on the values with a sort method.
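
As a rough illustration of that suggestion, the sketch below converts the counts into named tuples and sorts them by count. The names TeamCount, team, and wins are made up for this example, and the small winners dictionary is sample data. Calling sorted(winners.items(), key=...) on the plain dictionary would also work.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical record type, used for this example only.
TeamCount = namedtuple("TeamCount", ["team", "wins"])

# Sample counts, as the dictionary-based code above would produce them.
winners = {"Boston Americans": 1, "Chicago Cubs": 2, "New York Giants": 1}

records = [TeamCount(team, wins) for team, wins in winners.items()]

# Sort by the wins field, highest first; the sort is stable, so ties keep their input order.
by_wins = sorted(records, key=attrgetter("wins"), reverse=True)

for record in by_wins:
    print(record.team, record.wins)
```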