Python Pandas: Filtering a DataFrame by applying a regular expression

Time: 2014-02-28 10:50:05

Tags: python regex pandas

I am able to filter a DataFrame based on the elements of a list, like this:

import pandas as pd
W1 = ['Animal','Ball','Cat','Derry','Element','Lapse','Animate this']
W2 = ['Krota','Catch','Yankee','Global','Zeb','Rat','Try']
df = pd.DataFrame({'W1':W1,'W2':W2})

l1 = ['Animal','Zeb','Q']
print(df[df['W1'].isin(l1) | df['W2'].isin(l1)])

        W1     W2
  0   Animal  Krota
  4  Element    Zeb

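But is there a way to filter by applying a regular expression instead, so that rows are matched on substrings rather than exact values? For example: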

l1 = ['An','Cat']

Intended result:

             W1      W2
0        Animal   Krota
1          Ball   Catch
2           Cat  Yankee
6  Animate this     Try

1 answer:

Answer 0 (score: 6)

Try this:

df[df['W1'].str.contains("|".join(l1)) | df['W2'].str.contains("|".join(l1))]


             W1      W2
0        Animal   Krota
1          Ball   Catch
2           Cat  Yankee
6  Animate this     Try
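
The one-liner above works because "|".join(l1) builds the alternation pattern 'An|Cat', and str.contains treats its argument as a regular expression by default. Below is a slightly more defensive sketch of the same idea, not part of the original answer. It assumes the list entries are meant as literal substrings (so they are passed through re.escape) and that missing values should count as non-matches (na=False). The names pattern, mask and mask_any are illustrative only.

```python
import re
import pandas as pd

W1 = ['Animal', 'Ball', 'Cat', 'Derry', 'Element', 'Lapse', 'Animate this']
W2 = ['Krota', 'Catch', 'Yankee', 'Global', 'Zeb', 'Rat', 'Try']
df = pd.DataFrame({'W1': W1, 'W2': W2})

l1 = ['An', 'Cat']

# Assumption: entries are literal substrings, so escape any regex
# metacharacters before joining them into one alternation pattern.
pattern = "|".join(re.escape(s) for s in l1)

# Case-sensitive substring match on each column; na=False makes
# missing values evaluate to False instead of propagating NaN.
mask = (df['W1'].str.contains(pattern, na=False)
        | df['W2'].str.contains(pattern, na=False))
result = df[mask]

# The same test applied to a list of columns: keep a row if any
# of the listed columns matches the pattern.
mask_any = df[['W1', 'W2']].apply(
    lambda col: col.str.contains(pattern, na=False)
).any(axis=1)

print(result)
```

If the entries in l1 are intended as real regular expressions rather than literal text, drop the re.escape call and join them as in the original answer.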