File to a dictionary of sets

Time: 2016-12-16 19:42:24

Tags: python python-2.7 dictionary

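I need to convert the contents of this file into a dictionary in which every key is the name of a movie and every value is a set of the names of the actors who play in it. I thought about reading the file, putting every movie name in a list, and then converting that list to a set to remove duplicates. After that, a loop would make every movie name a key. The question is what comes next: how do I set the values to the actor names?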

Comment: it is a text file (.txt). Example of the file contents:

Brad Pitt, Sleepers, Troy, Meet Joe Black, Oceans Eleven, Seven, Mr & Mrs Smith
Tom Hanks, You have got mail, Apollo 13, Sleepless in Seattle, Catch Me If You Can
Meg Ryan, You have got mail, Sleepless in Seattle
Diane Kruger, Troy, National Treasure
Dustin Hoffman, Sleepers, The Lost City
Anthony Hopkins, Hannibal, The Edge, Meet Joe Black, Proof
Alec Baldwin, The Edge, Pearl Harbor
Angelina Jolie, Bone Collector, Lara Croft Tomb Raider, Mr & Mrs Smith
Denzel Washington, Bone Collector, The Siege, American Gangster
Julia Roberts, Pretty Woman, Oceans Eleven, Runaway Bride
Gwyneth Paltrow, Shakespeare in Love, Bounce, Proof
Russell Crowe, Gladiator, A Beautiful Mind, Cinderella Man, American Gangster
Sylvester Stallone, Rocky, Rambo, Assassins
Johnny Depp, Edward Scissorhands, The Pirates of Caribbean, Finding Neverland
Leonardo Di Caprio, Titanic, Blood Diamond, The Departed, Catch Me If You Can
Antonio Banderas, The Mask of Zorro, Desperado
Tom Cruise, Top Gun, Mission Impossible, Jerry Maguire, A Few Good Men
Kate Winslet, Titanic, Finding Neverland
George Clooney, Oceans Eleven, Intolerable Cruelty
Matt Damon, Good Will Hunting, Bourne Identity, Bourne Ultimatum, The Departed
Ben Affleck, Bounce, Good Will Hunting, Pearl Harbor
Catherine Zeta Jones, The Mask of Zorro, Intolerable Cruelty
Morgan Freeman, Bone Collector, Seven, Million Dollar Baby, Bruce Almighty
Bruce Willis, Die Hard, The Sixth Sense, Pulp Fiction, The Siege
Julianne Moore, Assassins, Hannibal
Salma Hayek, Desperado, Wild Wild West
Will Smith, Wild Wild West, Pursuit of Happyness, Hitch, Men in Black
Kevin Bacon, A Few Good Men, Sleepers
Jim Carrey, The Mask, Bruce Almighty, Ace Ventura, Me-Myself & Irene
Renee Zellweger, Me-Myself & Irene, Jerry Maguire, Cinderella Man

1 answer:

Answer 0 (score: 0)

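Even though the input is a plain text file, you can read it with Python's csv module, because it appears to be well-formed comma-separated data. Doing so makes processing the data and creating the dictionary itself relatively straightforward.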
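
To see what the csv module hands back for each line, here is a small sketch that parses two lines copied from the sample data. The list named lines is only a stand-in for the open file (csv.reader accepts any iterable of strings), and skipinitialspace=True is what drops the space after each comma.

```python
import csv

# Two lines copied from the sample file; a list of strings stands in
# for the open file here.
lines = [
    "Meg Ryan, You have got mail, Sleepless in Seattle",
    "Diane Kruger, Troy, National Treasure",
]

# skipinitialspace=True discards the space that follows each comma,
# so the fields come back without leading whitespace.
rows = list(csv.reader(lines, skipinitialspace=True))

for row in rows:
    # The first field is the actor, the remaining fields are the movies.
    actor, movies = row[0], row[1:]
    print('{} -> {}'.format(actor, movies))
```

Without skipinitialspace=True, every movie title would keep its leading space and would not match the same title appearing elsewhere in the file.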

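Formatting the result is a little convoluted because I wanted it laid out in a particular way, but that's not what you were asking about, right? ;-)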

#!/usr/bin/env python2
import csv

# Create and populate the target dictionary from the data in the file.
movie_actors_dict = {}
with open('actors.txt', 'rb') as csvfile:
    for row in csv.reader(csvfile, skipinitialspace=True):
        actor, movies = row[0], row[1:]
        for movie in movies:
            movie_actors_dict.setdefault(movie, set()).add(actor)

# Display resulting dictionary.
print('{')
representation = ',\n'.join(('    {!r}: {{\n'.format(movie)
                             + ',\n'.join('        {!r}'.format(actor)
                                for actor in sorted(movie_actors_dict[movie]))
                             + '\n    }') for movie in sorted(movie_actors_dict))
print(representation)
print('}')

Sample output:

{
    'A Beautiful Mind': {
        'Russell Crowe'
    },
    'A Few Good Men': {
        'Kevin Bacon',
        'Tom Cruise'
    },
    'Ace Ventura': {
        'Jim Carrey'
    },
    'American Gangster': {
        'Denzel Washington',
        'Russell Crowe'
    },
    'Apollo 13': {
        'Tom Hanks'
    },
    'Assassins': {
        'Julianne Moore',
        'Sylvester Stallone'
    },
    'Blood Diamond': {
        'Leonardo Di Caprio'
    },
    'Bone Collector': {
        'Angelina Jolie',
        'Denzel Washington',
        'Morgan Freeman'
    },
    ... etc, etc
}
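
The script above targets Python 2.7, as the question is tagged. If you are on Python 3, a rough sketch of the same idea might look like the following. The function name build_movie_actors and the in-memory sample are my own illustrative choices, not part of the original script, and collections.defaultdict is used here in place of setdefault purely as a matter of taste.

```python
import csv
import io
from collections import defaultdict


def build_movie_actors(lines):
    """Map each movie title to the set of actors listed for it.

    `lines` is any iterable of text lines, e.g. an open text-mode file.
    (Hypothetical helper name, chosen for this sketch.)
    """
    movie_actors = defaultdict(set)
    for row in csv.reader(lines, skipinitialspace=True):
        if not row:  # ignore blank lines
            continue
        actor, movies = row[0], row[1:]
        for movie in movies:
            movie_actors[movie].add(actor)
    return dict(movie_actors)


# In-memory stand-in for the file, using three lines from the sample data.
# With a real file on Python 3, open it in text mode instead of 'rb':
#     with open('actors.txt', newline='') as csvfile: ...
sample = io.StringIO(
    "Brad Pitt, Sleepers, Troy\n"
    "Diane Kruger, Troy, National Treasure\n"
    "Dustin Hoffman, Sleepers, The Lost City\n"
)

movie_actors_dict = build_movie_actors(sample)
for movie in sorted(movie_actors_dict):
    print(movie, sorted(movie_actors_dict[movie]))
```

The main difference from the Python 2 version is how the file is opened: on Python 3 the csv module expects a text-mode file, and the documentation recommends passing newline='' to open().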