Suppose I have a DataFrame:
import pandas as pd

first_df = pd.DataFrame({"company" : ['abc','def','xyz','lmn','def','xyz'],
                         "art_type": ['300x240','100x600','400x600','300x240','100x600','400x600'],
                         "metrics" : ['imp','rev','cpm','imp','rev','cpm'],
                         "value": [1234,23,0.5,1234,23,0.5]})
# DataFrame.append was removed in pandas 2.0; pd.concat duplicates the rows the same way
first_df = pd.concat([first_df, first_df])
I want to remove every row whose company value is in the list ['lmn', 'xyz'] and store those rows in another DataFrame.
company_list = ['lmn', 'xyz']
I tried:
deleted_data = first_df[first_df['company'] in company_list]
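This obviously doesn't work, because it checks a whole column (itself list-like) against a list. Is a for loop the way to do this, or is there a better way?
The for-loop code: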
deleted_data = pd.DataFrame()
for x in company_list:
    # DataFrame.append was removed in pandas 2.0, so concatenate instead
    deleted_data = pd.concat([deleted_data, first_df[first_df['company'] == x]])
Answer 0 (score: 3)
You can filter with isin(), which returns a boolean mask marking the rows whose company appears in the list:
deleted_data = first_df.loc[first_df['company'].isin(company_list)]
>>> deleted_data
  art_type company metrics   value
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5
retained_data = first_df.loc[~first_df['company'].isin(company_list)]
>>> retained_data
  art_type company metrics  value
0  300x240     abc     imp   1234
1  100x600     def     rev     23
4  100x600     def     rev     23
0  300x240     abc     imp   1234
1  100x600     def     rev     23
4  100x600     def     rev     23
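For reference, here is a minimal self-contained sketch of the same split that builds the isin() mask once and reuses it for both halves. It assumes a recent pandas in which DataFrame.append is no longer available, so the sample frame is duplicated with pd.concat. The names mask and in_error are only illustrative and do not appear in the question. The last part also shows, as far as I can tell, why the original `in` attempt fails: Python needs a single True/False, but comparing a Series produces one boolean per row.

```python
import pandas as pd

first_df = pd.DataFrame({
    "company": ['abc', 'def', 'xyz', 'lmn', 'def', 'xyz'],
    "art_type": ['300x240', '100x600', '400x600', '300x240', '100x600', '400x600'],
    "metrics": ['imp', 'rev', 'cpm', 'imp', 'rev', 'cpm'],
    "value": [1234, 23, 0.5, 1234, 23, 0.5],
})
# Assumption: pandas >= 2.0, where DataFrame.append no longer exists.
first_df = pd.concat([first_df, first_df])

company_list = ['lmn', 'xyz']

# Build the boolean mask once (illustrative name) and reuse it for both halves.
mask = first_df['company'].isin(company_list)
deleted_data = first_df[mask]     # rows whose company is in the list
retained_data = first_df[~mask]   # every other row

# The original attempt: `in` compares each list element with the whole Series,
# then asks for the truth value of the resulting Series, which pandas refuses.
in_error = None
try:
    first_df['company'] in company_list
except ValueError as exc:
    in_error = exc

print(deleted_data)
print(retained_data)
print(type(in_error).__name__)
```

Keeping the mask in a variable avoids evaluating isin() twice and makes it harder for the two filters to drift apart if company_list changes later.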