I have a dataframe that can be generated with the code below:
import pandas as pd
df = pd.DataFrame({'person_id' :[1,2,3],'date1': ['12/31/2007','11/25/2009','10/06/2005'],'date1derived':[0,0,0],'val1':[2,4,6],'date2': ['12/31/2017','11/25/2019','10/06/2015'],'date2derived':[0,0,0],'val2':[1,3,5],'date3':['12/31/2027','11/25/2029','10/06/2025'],'date3derived':[0,0,0],'val3':[7,9,11]})
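The dataframe looks as shown below.
I would like to remove the columns whose names contain "derived". I tried a few different regular expressions but could not get the expected output.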
df = df.filter(regex='[^H\dDerived]+', axis=1)
df = df.filter(regex='[^Derived]',axis=1)
Can you tell me the correct regex?
Answer 0: (score: 1)
df[[c for c in df.columns if 'derived' not in c ]]
Output
   person_id       date1  val1       date2  val2       date3  val3
0          1  12/31/2007     2  12/31/2017     1  12/31/2027     7
1          2  11/25/2009     4  11/25/2019     3  11/25/2029     9
2          3  10/06/2005     6  10/06/2015     5  10/06/2025    11
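Answer 1: (score: 1)
You can use a zero-width negative lookahead to make sure the string derived does not appear anywhere in the column name:
^(?!.*?derived)
^ matches the start of the string.
(?!.*?derived) is the negative lookahead; it fails the match if derived occurs anywhere in the string.
Your pattern [^Derived] is a negated character class: it matches any single character that is not one of D / e / r / i / v / d. Almost every column name contains at least one such character, so nothing gets filtered out.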
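As a minimal sketch of the difference between the two patterns, the snippet below applies both to a shortened version of the question's dataframe (only four of the original columns, chosen here for brevity). df.filter applies the pattern with a regex search on each column name, so the ^ anchor is what makes the lookahead cover the whole name.

```python
import pandas as pd

# Shortened version of the dataframe from the question (fewer columns).
df = pd.DataFrame({
    'person_id': [1, 2, 3],
    'date1': ['12/31/2007', '11/25/2009', '10/06/2005'],
    'date1derived': [0, 0, 0],
    'val1': [2, 4, 6],
})

# Negated character class: a name matches as soon as it contains one
# character outside {D, e, r, i, v, d}, so every column is kept.
kept_by_char_class = list(df.filter(regex='[^Derived]', axis=1).columns)

# Negative lookahead anchored at the start: a name matches only if
# "derived" does not occur anywhere in it.
kept_by_lookahead = list(df.filter(regex=r'^(?!.*?derived)', axis=1).columns)

print(kept_by_char_class)
print(kept_by_lookahead)
```

The first list still contains date1derived, while the second one does not.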
Answer 2: (score: 1)
IIUC, you want to drop the columns that have derived in their names. This should do it:
df.drop(df.filter(like='derived').columns, axis=1)
Out[455]:
   person_id       date1  val1       date2  val2       date3  val3
0          1  12/31/2007     2  12/31/2017     1  12/31/2027     7
1          2  11/25/2009     4  11/25/2019     3  11/25/2029     9
2          3  10/06/2005     6  10/06/2015     5  10/06/2025    11
Answer 3: (score: 1)
Using pd.Index.difference() and df.filter():
df[df.columns.difference(df.filter(like='derived').columns,sort=False)]
   person_id       date1  val1       date2  val2       date3  val3
0          1  12/31/2007     2  12/31/2017     1  12/31/2027     7
1          2  11/25/2009     4  11/25/2019     3  11/25/2029     9
2          3  10/06/2005     6  10/06/2015     5  10/06/2025    11
Answer 4: (score: 1)
In recent versions of pandas, you can use string methods on the index and on the columns. Here, str.endswith seems like a good fit.
import pandas as pd
df = pd.DataFrame({'person_id' :[1,2,3],'date1': ['12/31/2007','11/25/2009','10/06/2005'],
'date1derived':[0,0,0],'val1':[2,4,6],'date2': ['12/31/2017','11/25/2019','10/06/2015'],
'date2derived':[0,0,0],'val2':[1,3,5],'date3':['12/31/2027','11/25/2029','10/06/2025'],
'date3derived':[0,0,0],'val3':[7,9,11]})
df = df.loc[:,~df.columns.str.endswith('derived')]
print(df)
Output:
   person_id       date1  val1       date2  val2       date3  val3
0          1  12/31/2007     2  12/31/2017     1  12/31/2027     7
1          2  11/25/2009     4  11/25/2019     3  11/25/2029     9
2          3  10/06/2005     6  10/06/2015     5  10/06/2025    11
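Note that str.endswith only catches names where derived is a suffix, which happens to be true for every column in the question. If the substring could appear elsewhere in a name, str.contains is probably the safer choice. A rough sketch, where derived_flag is a hypothetical column added purely for illustration and is not part of the original data:

```python
import pandas as pd

df = pd.DataFrame({
    'person_id': [1, 2, 3],
    'date1derived': [0, 0, 0],
    'derived_flag': [1, 0, 1],  # hypothetical column, not in the question
    'val1': [2, 4, 6],
})

# Suffix check: misses 'derived_flag' because "derived" is not at the end.
ends_mask = df.columns.str.endswith('derived')

# Substring check: regex=False treats 'derived' as a literal string.
contains_mask = df.columns.str.contains('derived', regex=False)

kept_endswith = list(df.loc[:, ~ends_mask].columns)
kept_contains = list(df.loc[:, ~contains_mask].columns)

print(kept_endswith)
print(kept_contains)
```

With the suffix check, derived_flag survives; with the substring check, both derived columns are removed.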