How to filter one dataset by keys taken from another dataset

Time: 2019-08-02 15:00:15

Tags: python pandas numpy

I have a book ratings dataset that looks like this:

ratings.head()

    User-ID     ISBN    Book-Rating
0   276725  034545104X  0
1   276726  0155061224  5
2   276727  0446520802  0
3   276729  052165615X  3
4   276729  0521795028  6

I want to filter this dataset down to the users who liked a particular book.

I tried:

lotr_ratings = ratings[ratings['ISBN'] == '0345339703'] 
liked_lotr = lotr_ratings[lotr_ratings['Book-Rating'] == 10]  # readers who liked LOTR
liked_lotr = liked_lotr['User-ID'].to_frame() 
ratings[ratings['User-ID'] == liked_lotr] # Filter the original dataset

It fails with:

  

MemoryError

Any help would be appreciated. Thanks.

1 answer:

Answer 0 (score: 1)

It looks like you just want to create a new DataFrame based on multiple conditions. Do this:

conditions = (ratings['ISBN'] == '0345339703') & (ratings['Book-Rating'] == 10)

like_lotr = ratings[conditions]
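
Note that the mask above keeps only the rows for that one book. If the goal is instead every rating made by the users who liked it, which is what the last line of the question seems to attempt, comparing a column against a whole DataFrame with `==` is not a membership test. One possible approach is `Series.isin`. The following is a minimal sketch, not a definitive solution: the toy data is made up for illustration, and the names `liked_mask`, `lotr_fans`, and `fans_ratings` are hypothetical.

```python
import pandas as pd

# Toy data shaped like the question's `ratings` frame (values are invented).
ratings = pd.DataFrame({
    "User-ID": [1, 1, 2, 2, 3],
    "ISBN": ["0345339703", "0155061224", "0345339703", "0446520802", "0345339703"],
    "Book-Rating": [10, 7, 4, 8, 10],
})

# Step 1: rows where the target book received a rating of 10.
liked_mask = (ratings["ISBN"] == "0345339703") & (ratings["Book-Rating"] == 10)

# Step 2: the distinct users behind those rows (a plain array of IDs).
lotr_fans = ratings.loc[liked_mask, "User-ID"].unique()

# Step 3: keep every rating whose User-ID is one of those users.
fans_ratings = ratings[ratings["User-ID"].isin(lotr_fans)]

print(fans_ratings)
```

Here `isin` builds a boolean mask with one entry per row of `ratings`, so the result has the same columns as the original frame and contains all books rated by those users, not just the one used for the filter.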