I can filter my data with pd_data = pd_data[pd_data['db_rating']>0]
to select the records where db_rating > 0.
Now I want to involve another column as well, for example selecting the records where both db_rating > 0
and imdb_ratings_count > 1000.
But
pd_data = pd_data[pd_data['db_rating']>0 and pd_data['imdb_ratings_count']>1000]
gives me this error:
ValueError Traceback (most recent call last)
<ipython-input-120-f83883d4bac8> in <module>()
3 pd_data['imdb_rating'] = pd_data['imdb_rating'].astype(float)
4 pd_data['imdb_ratings_count'] = pd_data['imdb_ratings_count'].astype(float)
----> 5 pd_data = pd_data[pd_data['db_rating']>0 and pd_data['imdb_ratings_count']>1000]
6 pd_data.describe()
D:\Anaconda2\lib\site-packages\pandas\core\generic.pyc in __nonzero__(self)
696 raise ValueError("The truth value of a {0} is ambiguous. "
697 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 698 .format(self.__class__.__name__))
699
700 __bool__ = __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What should I do?
Answer 0 (score: 2)
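pandas overloads the & operator so that it performs an element-wise boolean AND on Series. Python's "and" cannot be overloaded; it asks for a single truth value of the whole Series, which is what raises the ValueError. This should work: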
pd_data = pd_data[(pd_data['db_rating']>0) & (pd_data['imdb_ratings_count']>1000)]
See http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing
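To make the behaviour concrete, here is a minimal sketch of the same filter on a tiny DataFrame. The sample values are invented for illustration; only the column names db_rating and imdb_ratings_count come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
pd_data = pd.DataFrame({
    "db_rating": [0.0, 7.5, 8.2, 6.1],
    "imdb_ratings_count": [5000.0, 800.0, 12000.0, 1500.0],
})

# Each comparison yields a boolean Series; & combines them element-wise.
mask = (pd_data["db_rating"] > 0) & (pd_data["imdb_ratings_count"] > 1000)

# Indexing with the boolean Series keeps only the rows where mask is True.
filtered = pd_data[mask]
print(filtered)
```

Building the mask as a separate variable is optional, but it tends to make longer conditions easier to read and debug.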
Answer 1 (score: 2)
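When combining boolean vectors in pandas, use the bitwise operators (&, |, ~) instead of and, or, not, and wrap each comparison in parentheses because & binds more tightly than >: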
pd_data = pd_data[(pd_data['db_rating']>0) & (pd_data['imdb_ratings_count']>1000)]
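As a rough sketch of the other element-wise operators, the example below uses | for OR and ~ for NOT, and also shows DataFrame.query as one possible alternative that accepts the word "and" inside its expression string. The sample values are made up; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
pd_data = pd.DataFrame({
    "db_rating": [0.0, 7.5, 8.2, 6.1],
    "imdb_ratings_count": [5000.0, 800.0, 12000.0, 1500.0],
})

rated = pd_data["db_rating"] > 0
popular = pd_data["imdb_ratings_count"] > 1000

both = pd_data[rated & popular]      # element-wise AND
either = pd_data[rated | popular]    # element-wise OR
not_popular = pd_data[~popular]      # element-wise NOT

# Alternative: query() parses the expression itself, so "and" is allowed here.
via_query = pd_data.query("db_rating > 0 and imdb_ratings_count > 1000")

print(both)
print(via_query)
```

Whether to prefer the bracket form or query() is largely a matter of taste; the bracket form works with any column name, while query() can read more naturally for long conditions.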