Delete rows from a DataFrame based on a condition

Time: 2015-10-19 22:44:41

Tags: pandas dataframe delete-row

Let's say I have a DataFrame:

import pandas as pd

first_df = pd.DataFrame({"company" : ['abc','def','xyz','lmn','def','xyz'],
                         "art_type": ['300x240','100x600','400x600','300x240','100x600','400x600'],
                         "metrics" : ['imp','rev','cpm','imp','rev','cpm'],
                         "value": [1234,23,0.5,1234,23,0.5]})
# Duplicate the rows; pd.concat replaces DataFrame.append, which was removed in pandas 2.0
first_df = pd.concat([first_df, first_df])

I want to remove all rows whose company value is in the list ['lmn', 'xyz'] and store them in another DataFrame.

company_list = ['lmn', 'xyz']

I tried:

deleted_data = first_df[first_df['company'] in company_list] 

This obviously doesn't work, since it tests a list against a list. Is a for loop the way to do this, or is there a better way?

For loop code:

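# Collect the matching rows for each company, then concatenate once
# (DataFrame.append was removed in pandas 2.0)
matches = []
for x in company_list:
    matches.append(first_df[first_df['company'] == x])
deleted_data = pd.concat(matches)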

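1 Answer:

Answer 0: (score: 3)

You can filter with isin(). It returns a boolean mask marking the rows whose company appears in company_list, and negating the mask with ~ selects the remaining rows.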

deleted_data = first_df.loc[first_df['company'].isin(company_list)]
>>> deleted_data 
  art_type company metrics   value
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5

retained_data = first_df.loc[~first_df['company'].isin(company_list)]
>>> retained_data
  art_type company metrics  value
0  300x240     abc     imp   1234
1  100x600     def     rev     23
4  100x600     def     rev     23
0  300x240     abc     imp   1234
1  100x600     def     rev     23
4  100x600     def     rev     23
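
For reference, here is a self-contained sketch of the same split that should run on current pandas. It assumes pandas 2.x, where pd.concat stands in for the removed DataFrame.append, and the variable name mask is my own, not something from the question. The mask is built once and reused for both halves.

```python
import pandas as pd

first_df = pd.DataFrame({
    "company": ['abc', 'def', 'xyz', 'lmn', 'def', 'xyz'],
    "art_type": ['300x240', '100x600', '400x600', '300x240', '100x600', '400x600'],
    "metrics": ['imp', 'rev', 'cpm', 'imp', 'rev', 'cpm'],
    "value": [1234, 23, 0.5, 1234, 23, 0.5],
})
# Assumption: pandas 2.x, so pd.concat is used instead of DataFrame.append
first_df = pd.concat([first_df, first_df])

company_list = ['lmn', 'xyz']

# Boolean Series: True where the row's company is in company_list
mask = first_df['company'].isin(company_list)

deleted_data = first_df[mask]      # rows being removed
retained_data = first_df[~mask]    # rows being kept

print(deleted_data['company'].tolist())
print(retained_data['company'].tolist())
```

Because the original index labels are kept, both results still contain repeated labels such as 2, 3 and 5. If that gets in the way, calling reset_index(drop=True) on either result renumbers the rows.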