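I have a Pandas DataFrame (tmdb_df) containing data about movies (from TMDB). One of the columns is "genres", which lists the genres that apply to each movie, separated by "|". The DataFrame also contains revenue, popularity, vote ratings, and so on. I'd like to be able to look at these metrics by genre.
Example of tmdb_df['genres'].head():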
0 Action|Adventure|Science Fiction|Thriller
1 Action|Adventure|Science Fiction|Thriller
2 Adventure|Science Fiction|Thriller
3 Action|Adventure|Science Fiction|Fantasy
4 Action|Crime|Thriller
I started by creating a new DataFrame with all of the unique genres.
Code:
import pandas as pd

# Split the pipe-delimited genres into one column per position (col1, col2, ...)
all_genres = tmdb_df['genres'].str.split("|", expand=True)
all_genres.rename(columns=lambda x: "col" + str(x+1), inplace=True)
# Flatten the five genre columns and keep each distinct value once
unique_genres = pd.DataFrame({'genres': pd.unique(all_genres[['col1', 'col2','col3', 'col4','col5']].values.ravel('K'))})
unique_genres.sort_values('genres', inplace=True)
# Renumber the rows from 0 and discard the old index
unique_genres.reset_index(drop=True, inplace=True)
unique_genres.head()
Output:
genres
0 Action
1 Adventure
2 Animation
3 Comedy
4 Crime
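For comparison, the same list of unique genres can probably be built without naming col1 through col5 explicitly. The following is only a sketch: the small `genres` Series is a hypothetical stand-in for tmdb_df['genres'], and `Series.explode` assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical stand-in for tmdb_df['genres'] (first five rows of the sample).
genres = pd.Series(
    [
        "Action|Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Thriller",
        "Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Fantasy",
        "Action|Crime|Thriller",
    ],
    name="genres",
)

unique_genres = (
    genres.str.split("|")     # each row becomes a list of genre names
    .explode()                # one row per individual genre
    .dropna()                 # ignore movies with no genre listed
    .drop_duplicates()        # keep each genre once
    .sort_values()
    .reset_index(drop=True)
    .to_frame()               # single-column DataFrame named 'genres'
)
print(unique_genres)
```

Because the split produces lists rather than fixed columns, this version should not depend on how many genres the longest entry has.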
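Now I'd like to compute some summary statistics to see how often each genre occurs, the average rating per genre, and so on. I know I need to compare the values in the two DataFrames, but I can't seem to get it to work. I've tried various for loops, but I always end up with an error.
Example: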
for row in tmdb_df.itertuples():
    if unique_genres['genres'].str.contains(row[14]):
        print("true")
Error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
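As far as I can tell, the error arises because `.str.contains` returns a boolean Series (one value per row), and `if` needs a single True/False. The sketch below shows that reduction with `.any()`, and then one possible way to get per-genre statistics without a loop. It rests on assumptions: the small frame is a hypothetical stand-in for tmdb_df with only three of its columns, the result names (`movie_count`, `mean_vote`, `mean_revenue`) are made up for illustration, and `explode` with named aggregation assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical stand-in for tmdb_df, using values from the sample rows.
tmdb_df = pd.DataFrame({
    "genres": [
        "Action|Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Thriller",
        "Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Fantasy",
        "Action|Crime|Thriller",
    ],
    "vote_average": [6.5, 7.1, 6.3, 7.5, 7.3],
    "revenue": [1513528810, 378436354, 295238201, 2068178225, 1506249360],
})

# str.contains yields one boolean per row; .any() collapses it to a single bool
# that is safe to use in an `if` statement.
mask = tmdb_df["genres"].str.contains("Crime", regex=False)
has_crime = mask.any()

# Turn each movie into one row per genre it belongs to.
per_genre = tmdb_df.assign(genres=tmdb_df["genres"].str.split("|")).explode("genres")

# Aggregate the metrics by genre (output column names are illustrative).
genre_stats = per_genre.groupby("genres").agg(
    movie_count=("vote_average", "size"),
    mean_vote=("vote_average", "mean"),
    mean_revenue=("revenue", "mean"),
)
print(genre_stats)
```

With this layout a movie counts once toward every genre it lists, so the per-genre counts add up to more than the number of movies.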
Stack has been very helpful in getting me this far, but I'm stuck now. Thanks in advance for your help!
EDIT: In case it helps, here is the full tmdb_df DataFrame.
+---+--------+-----------+------------+-----------+------------+------------------------------+---------------------------------------------------+---------------------------------------------------+------------------+-------------------------------+-----+---------------------------------------------------+---------+-------------------------------------------+---------------------------------------------------+--------------+------------+--------------+--------------+-------------+--------------+
| | id | imdb_id | popularity | budget | revenue | original_title | cast | homepage | director | tagline | ... | overview | runtime | genres | production_companies | release_date | vote_count | vote_average | release_year | budget_adj | revenue_adj |
+---+--------+-----------+------------+-----------+------------+------------------------------+---------------------------------------------------+---------------------------------------------------+------------------+-------------------------------+-----+---------------------------------------------------+---------+-------------------------------------------+---------------------------------------------------+--------------+------------+--------------+--------------+-------------+--------------+
| 0 | 135397 | tt0369610 | 32.985763 | 150000000 | 1513528810 | Jurassic World | Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi... | http://www.jurassicworld.com/ | Colin Trevorrow | The park is open. | ... | Twenty-two years after the events of Jurassic ... | 124 | Action|Adventure|Science Fiction|Thriller | Universal Studios|Amblin Entertainment|Legenda... | 6/09/15 | 5562 | 6.5 | 2015 | 137999939.3 | 1.392446e+09 |
| 1 | 76341 | tt1392190 | 28.419936 | 150000000 | 378436354 | Mad Max: Fury Road | Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic... | http://www.madmaxmovie.com/ | George Miller | What a Lovely Day. | ... | An apocalyptic story set in the furthest reach... | 120 | Action|Adventure|Science Fiction|Thriller | Village Roadshow Pictures|Kennedy Miller Produ... | 5/13/15 | 6185 | 7.1 | 2015 | 137999939.3 | 3.481613e+08 |
| 2 | 262500 | tt2908446 | 13.112507 | 110000000 | 295238201 | Insurgent | Shailene Woodley|Theo James|Kate Winslet|Ansel... | http://www.thedivergentseries.movie/#insurgent | Robert Schwentke | One Choice Can Destroy You | ... | Beatrice Prior must confront her inner demons ... | 119 | Adventure|Science Fiction|Thriller | Summit Entertainment|Mandeville Films|Red Wago... | 3/18/15 | 2480 | 6.3 | 2015 | 101199955.5 | 2.716190e+08 |
| 3 | 140607 | tt2488496 | 11.173104 | 200000000 | 2068178225 | Star Wars: The Force Awakens | Harrison Ford|Mark Hamill|Carrie Fisher|Adam D... | http://www.starwars.com/films/star-wars-episod... | J.J. Abrams | Every generation has a story. | ... | Thirty years after defeating the Galactic Empi... | 136 | Action|Adventure|Science Fiction|Fantasy | Lucasfilm|Truenorth Productions|Bad Robot | 12/15/15 | 5292 | 7.5 | 2015 | 183999919.0 | 1.902723e+09 |
| 4 | 168259 | tt2820852 | 9.335014 | 190000000 | 1506249360 | Furious 7 | Vin Diesel|Paul Walker|Jason Statham|Michelle ... | http://www.furious7.com/ | James Wan | Vengeance Hits Home | ... | Deckard Shaw seeks revenge against Dominic Tor... | 137 | Action|Crime|Thriller | Universal Pictures|Original Film|Media Rights ... | 4/01/15 | 2947 | 7.3 | 2015 | 174799923.1 | 1.385749e+09 |
+---+--------+-----------+------------+-----------+------------+------------------------------+---------------------------------------------------+---------------------------------------------------+------------------+-------------------------------+-----+---------------------------------------------------+---------+-------------------------------------------+---------------------------------------------------+--------------+------------+--------------+--------------+-------------+--------------+