How to compare values in two DataFrames

Time: 2019-01-11 17:24:30

Tags: python pandas dataframe

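I have a Pandas DataFrame (tmdb_df) containing data about movies (from TMDB). One of the columns is "genres", which lists the genres that apply to each movie, separated by "|". The DataFrame also contains revenue, popularity, ratings, and so on. I'd like to be able to look at these metrics by genre.

Example: tmdb_df['genres'].head()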

0    Action|Adventure|Science Fiction|Thriller
1    Action|Adventure|Science Fiction|Thriller
2           Adventure|Science Fiction|Thriller
3     Action|Adventure|Science Fiction|Fantasy
4                        Action|Crime|Thriller 

I started by creating a new DataFrame containing all of the unique genres.

Code:

import pandas as pd

# Split the pipe-separated genres into one column per genre (col1, col2, ...).
all_genres = tmdb_df['genres'].str.split("|", expand=True)
all_genres.rename(columns=lambda x: "col" + str(x+1), inplace=True)

# Collect the distinct values across the genre columns, then sort them.
unique_genres = pd.DataFrame({'genres': pd.unique(all_genres[['col1', 'col2','col3', 'col4','col5']].values.ravel('K'))})
unique_genres.sort_values('genres', inplace=True)
# Replace the old index with a fresh 0..n-1 index.
unique_genres.reset_index(drop=True, inplace=True)

unique_genres.head()

Output:

    genres
0   Action
1   Adventure
2   Animation
3   Comedy
4   Crime
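
As an aside, the same list of unique genres can probably be built more compactly. This is only a minimal sketch, assuming pandas 0.25 or later (for Series.explode). The small tmdb_df below is a hypothetical stand-in built from the five sample rows shown earlier, not the full dataset.

```python
import pandas as pd

# Hypothetical stand-in for tmdb_df, using the five sample rows from the question.
tmdb_df = pd.DataFrame({
    "genres": [
        "Action|Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Thriller",
        "Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Fantasy",
        "Action|Crime|Thriller",
    ]
})

# Split each string into a list, then give every genre its own row.
exploded = tmdb_df["genres"].str.split("|").explode()

# Drop missing values, de-duplicate, and sort alphabetically.
unique_genres = pd.DataFrame({"genres": sorted(exploded.dropna().unique())})
print(unique_genres)
```

Unlike the column-based version, this does not depend on how many genres a single movie can have.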

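Now I'd like to compute some summary statistics to see how often each genre occurs, its average rating, and so on. I know I need to compare the values in the two DataFrames, but I can't seem to get it to work. I've tried various for loops, but I always get an error.

Example: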

for row in tmdb_df.itertuples():
    if unique_genres['genres'].str.contains(row[14]):
        print("true")

Error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
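
For context, the error most likely arises because str.contains returns a boolean Series (one value per row of unique_genres), and an if statement needs a single True or False. The sketch below shows one way to reduce that Series before testing it. The three-row unique_genres and the row_genres string are hypothetical stand-ins for the real data. Note also that str.contains treats its pattern as a regular expression by default, so the "|" in a genres string acts as alternation.

```python
import pandas as pd

# Hypothetical stand-ins for the question's data.
unique_genres = pd.DataFrame({"genres": ["Action", "Adventure", "Crime"]})
row_genres = "Action|Crime|Thriller"  # the genres field of one movie row

# One boolean per genre; "|" is regex alternation here, so this asks
# "does the genre name contain Action, Crime, or Thriller?"
mask = unique_genres["genres"].str.contains(row_genres)

# Reduce the Series to a single bool before using it in `if`.
has_match = bool(mask.any())
if has_match:
    print("true")

# The mask can also select which genres matched.
matched = unique_genres.loc[mask, "genres"].tolist()
print(matched)
```

This only avoids the ValueError. It does not by itself produce per-genre statistics.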

Stack has been very helpful in getting me this far, but I'm stuck now. Thanks in advance for your help!

Edit: In case it helps, here is the full tmdb_df DataFrame.

+---+--------+-----------+------------+-----------+------------+------------------------------+---------------------------------------------------+---------------------------------------------------+------------------+-------------------------------+-----+---------------------------------------------------+---------+-------------------------------------------+---------------------------------------------------+--------------+------------+--------------+--------------+-------------+--------------+
|   |   id   |  imdb_id  | popularity |  budget   |  revenue   |        original_title        |                       cast                        |                     homepage                      |     director     |            tagline            | ... |                     overview                      | runtime |                  genres                   |               production_companies                | release_date | vote_count | vote_average | release_year | budget_adj  | revenue_adj  |
+---+--------+-----------+------------+-----------+------------+------------------------------+---------------------------------------------------+---------------------------------------------------+------------------+-------------------------------+-----+---------------------------------------------------+---------+-------------------------------------------+---------------------------------------------------+--------------+------------+--------------+--------------+-------------+--------------+
| 0 | 135397 | tt0369610 |  32.985763 | 150000000 | 1513528810 | Jurassic World               | Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vi... | http://www.jurassicworld.com/                     | Colin Trevorrow  | The park is open.             | ... | Twenty-two years after the events of Jurassic ... |     124 | Action|Adventure|Science Fiction|Thriller | Universal Studios|Amblin Entertainment|Legenda... | 6/09/15      |       5562 |          6.5 |         2015 | 137999939.3 | 1.392446e+09 |
| 1 |  76341 | tt1392190 |  28.419936 | 150000000 |  378436354 | Mad Max: Fury Road           | Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nic... | http://www.madmaxmovie.com/                       | George Miller    | What a Lovely Day.            | ... | An apocalyptic story set in the furthest reach... |     120 | Action|Adventure|Science Fiction|Thriller | Village Roadshow Pictures|Kennedy Miller Produ... | 5/13/15      |       6185 |          7.1 |         2015 | 137999939.3 | 3.481613e+08 |
| 2 | 262500 | tt2908446 |  13.112507 | 110000000 |  295238201 | Insurgent                    | Shailene Woodley|Theo James|Kate Winslet|Ansel... | http://www.thedivergentseries.movie/#insurgent    | Robert Schwentke | One Choice Can Destroy You    | ... | Beatrice Prior must confront her inner demons ... |     119 | Adventure|Science Fiction|Thriller        | Summit Entertainment|Mandeville Films|Red Wago... | 3/18/15      |       2480 |          6.3 |         2015 | 101199955.5 | 2.716190e+08 |
| 3 | 140607 | tt2488496 |  11.173104 | 200000000 | 2068178225 | Star Wars: The Force Awakens | Harrison Ford|Mark Hamill|Carrie Fisher|Adam D... | http://www.starwars.com/films/star-wars-episod... | J.J. Abrams      | Every generation has a story. | ... | Thirty years after defeating the Galactic Empi... |     136 | Action|Adventure|Science Fiction|Fantasy  | Lucasfilm|Truenorth Productions|Bad Robot         | 12/15/15     |       5292 |          7.5 |         2015 | 183999919.0 | 1.902723e+09 |
| 4 | 168259 | tt2820852 |   9.335014 | 190000000 | 1506249360 | Furious 7                    | Vin Diesel|Paul Walker|Jason Statham|Michelle ... | http://www.furious7.com/                          | James Wan        | Vengeance Hits Home           | ... | Deckard Shaw seeks revenge against Dominic Tor... |     137 | Action|Crime|Thriller                     | Universal Pictures|Original Film|Media Rights ... | 4/01/15      |       2947 |          7.3 |         2015 | 174799923.1 | 1.385749e+09 |
+---+--------+-----------+------------+-----------+------------+------------------------------+---------------------------------------------------+---------------------------------------------------+------------------+-------------------------------+-----+---------------------------------------------------+---------+-------------------------------------------+---------------------------------------------------+--------------+------------+--------------+--------------+-------------+--------------+
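
Using the column names from the table above, the per-genre summary described in the question could be sketched without comparing two DataFrames at all. This assumes pandas 0.25 or later (for DataFrame.explode and named aggregation). The tmdb_df below is a hypothetical reduced copy holding only four columns of the five rows shown, and the names long_df, summary, n_movies, mean_vote, and mean_revenue are made up for illustration.

```python
import pandas as pd

# Hypothetical reduced copy of tmdb_df: four columns of the five rows shown above.
tmdb_df = pd.DataFrame({
    "original_title": [
        "Jurassic World",
        "Mad Max: Fury Road",
        "Insurgent",
        "Star Wars: The Force Awakens",
        "Furious 7",
    ],
    "genres": [
        "Action|Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Thriller",
        "Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Fantasy",
        "Action|Crime|Thriller",
    ],
    "revenue": [1513528810, 378436354, 295238201, 2068178225, 1506249360],
    "vote_average": [6.5, 7.1, 6.3, 7.5, 7.3],
})

# One row per (movie, genre) pair: split the string, then explode the lists.
long_df = tmdb_df.assign(genre=tmdb_df["genres"].str.split("|")).explode("genre")

# Aggregate the metrics of interest for each genre.
summary = (
    long_df.groupby("genre")
    .agg(
        n_movies=("original_title", "count"),
        mean_vote=("vote_average", "mean"),
        mean_revenue=("revenue", "mean"),
    )
    .sort_values("n_movies", ascending=False)
)
print(summary)
```

A movie with several genres contributes to each of them, so the per-genre counts add up to more than the number of movies.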

0 answers:

No answers