Aggregating multiple rows in a list - Python

Time: 2018-09-19 15:27:42

Tags: python duplicates aggregation

I am trying to aggregate multiple values from a list so that the final result contains no duplicate rows.

I have this code:

import imdb
ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0
for topmovie in top250:
    # First, retrieve the movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    # Print the movie's genres
    for genre in movie['genres']:
        cast = movie.get('cast')
        topActors = 2
        i = i+1
        for actor in cast[:topActors]:
            if i <= 10:
                print('Movie: ', movie, ' Genre: ', genre, 'Actors: ', actor['name'])
            else:
                break

I get this:

Movie:  The Shawshank Redemption  Genre:  Drama Actors:  Tim Robbins
Movie:  The Shawshank Redemption  Genre:  Drama Actors:  Morgan Freeman
Movie:  The Godfather  Genre:  Crime Actors:  Marlon Brando
Movie:  The Godfather  Genre:  Crime Actors:  Al Pacino
Movie:  The Godfather  Genre:  Drama Actors:  Marlon Brando
Movie:  The Godfather  Genre:  Drama Actors:  Al Pacino
Movie:  The Godfather: Part II  Genre:  Crime Actors:  Al Pacino
Movie:  The Godfather: Part II  Genre:  Crime Actors:  Robert Duvall
Movie:  The Godfather: Part II  Genre:  Drama Actors:  Al Pacino
Movie:  The Godfather: Part II  Genre:  Drama Actors:  Robert Duvall
Movie:  The Dark Knight  Genre:  Action Actors:  Christian Bale
Movie:  The Dark Knight  Genre:  Action Actors:  Heath Ledger
Movie:  The Dark Knight  Genre:  Crime Actors:  Christian Bale
Movie:  The Dark Knight  Genre:  Crime Actors:  Heath Ledger
Movie:  The Dark Knight  Genre:  Drama Actors:  Christian Bale
Movie:  The Dark Knight  Genre:  Drama Actors:  Heath Ledger
Movie:  The Dark Knight  Genre:  Thriller Actors:  Christian Bale
Movie:  The Dark Knight  Genre:  Thriller Actors:  Heath Ledger
Movie:  12 Angry Men  Genre:  Crime Actors:  Martin Balsam
Movie:  12 Angry Men  Genre:  Crime Actors:  John Fiedler

Is it possible to get the following aggregated output instead?

Movie:  The Shawshank Redemption  Genre:  Drama Actors:  Tim Robbins|Morgan Freeman
Movie:  The Godfather  Genre:  Crime|Drama  Actors:  Marlon Brando|Al Pacino
Movie:  The Godfather: Part II  Genre:  Crime|Drama  Actors:  Al Pacino|Robert Duvall
Movie:  The Dark Knight  Genre:  Action|Crime|Drama|Thriller   Actors:  Christian Bale|Heath Ledger
Movie:  12 Angry Men  Genre:  Crime Actors:  Martin Balsam|John Fiedler

Can I get this using lists? Or do I need to store all the results in a DataFrame and then use groupby?

Thanks!

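1 Answer:

Answer 0 (score: 3)

Try the str.join() method. Instead of looping over genres and actors and printing one line per combination, join each list into a single '|'-separated string and print once per movie: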

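import imdb

ia = imdb.IMDb()
top250 = ia.get_top250_movies()
top_actors = 2  # number of leading cast members to show per movie

# Slice to the first 10 movies so no extra movie is fetched before stopping
for topmovie in top250[:10]:
    # First, retrieve the movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    # Fall back to an empty list if the movie has no cast information
    cast = movie.get('cast') or []
    actor_names = [actor['name'] for actor in cast[:top_actors]]
    # Join genres and actor names with '|' so each movie prints on one line
    print('Movie: ', movie, ' Genre: ', '|'.join(movie['genres']), 'Actors: ', '|'.join(actor_names))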
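
On the question of lists versus a DataFrame with groupby: if the data has already been flattened into one row per (movie, genre, actor) combination, as in the output shown in the question, a plain dictionary is probably enough to regroup it. The following is only a minimal sketch under that assumption. The `rows` sample data and the `aggregate_rows` helper are hypothetical names made up for illustration, not part of the imdb package, and the output spacing is simplified.

```python
# Hypothetical sample rows, shaped like the (movie, genre, actor) lines printed in the question
rows = [
    ("The Godfather", "Crime", "Marlon Brando"),
    ("The Godfather", "Crime", "Al Pacino"),
    ("The Godfather", "Drama", "Marlon Brando"),
    ("The Godfather", "Drama", "Al Pacino"),
    ("12 Angry Men", "Crime", "Martin Balsam"),
    ("12 Angry Men", "Crime", "John Fiedler"),
]


def aggregate_rows(rows):
    """Group rows by movie, keeping each genre and actor once, in first-seen order."""
    grouped = {}
    for movie, genre, actor in rows:
        # One entry per movie; the inner dicts act as ordered sets (Python 3.7+)
        genres, actors = grouped.setdefault(movie, ({}, {}))
        genres[genre] = None
        actors[actor] = None
    return [
        f"Movie: {movie} Genre: {'|'.join(genres)} Actors: {'|'.join(actors)}"
        for movie, (genres, actors) in grouped.items()
    ]


for line in aggregate_rows(rows):
    print(line)
```

Dictionaries are used here instead of sets because they preserve insertion order, so genres and actors come out in the order they first appeared. When the movie objects are still at hand, joining their lists directly as in the code above is simpler and avoids creating the duplicates in the first place.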