Question

I have a DataFrame like this:

df
col1          col2      col3
01/01/10      abc       pqr
10/10/18      sps       ggg
date          pqp       fdf
03/12/19      rt        sd
summary       re        ss

All columns are of string type, and I want to drop the rows whose col1 value is not a date.

The output df should look like this:

df
col1          col2      col3
01/01/10      abc       pqr
10/10/18      sps       ggg
03/12/19      rt        sd

What is the most efficient way to do this in Python?

Answer 1

You can use pd.to_datetime() with errors='coerce'. From the documentation:

If 'coerce', then invalid parsing will be set as NaT.

df.loc[pd.to_datetime(df.col1,errors='coerce').dropna().index]

       col1 col2 col3
0  01/01/10  abc  pqr
1  10/10/18  sps  ggg
3  03/12/19   rt   sd
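To try this end to end, here is a minimal self-contained sketch that rebuilds the frame from the question and filters it with a boolean mask instead of an index lookup. The explicit format="%m/%d/%y" is an assumption (the question does not say whether the dates are month-first or day-first), and the names parsed and result are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question; every column holds strings.
df = pd.DataFrame({
    "col1": ["01/01/10", "10/10/18", "date", "03/12/19", "summary"],
    "col2": ["abc", "sps", "pqp", "rt", "re"],
    "col3": ["pqr", "ggg", "fdf", "sd", "ss"],
})

# Assumption: dates are month/day/two-digit-year. Unparseable values become NaT.
parsed = pd.to_datetime(df["col1"], format="%m/%d/%y", errors="coerce")

# Keep only the rows where parsing succeeded; col1 itself stays a string column.
result = df[parsed.notna()]
print(result)
```

Passing an explicit format avoids relying on format inference, which may matter if the column mixes several date layouts.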

Or, if you want col1 to become a datetime column, use:

df.col1=pd.to_datetime(df.col1,errors='coerce')
df[df.col1.notna()]
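Note that df[df.col1.notna()] only returns the filtered frame; it does not modify df. A hedged sketch of the same idea that stores the result, using dropna(subset=...) and working on a copy, might look like this. The format string and the name converted are assumptions for illustration.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "col1": ["01/01/10", "10/10/18", "date", "03/12/19", "summary"],
    "col2": ["abc", "sps", "pqp", "rt", "re"],
    "col3": ["pqr", "ggg", "fdf", "sd", "ss"],
})

# Work on a copy so the original string column is preserved.
converted = df.copy()

# Assumption: month/day/two-digit-year. Invalid strings turn into NaT.
converted["col1"] = pd.to_datetime(
    converted["col1"], format="%m/%d/%y", errors="coerce"
)

# Drop rows where col1 is NaT and renumber the remaining rows from 0.
converted = converted.dropna(subset=["col1"]).reset_index(drop=True)
print(converted)
```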

Answer 2

Use re.findall:

import re

# Keep rows whose col1 contains a two-digit/two-digit/two-digit pattern.
df2[df2.apply(lambda x: len(re.findall(r'\d{2}/\d{2}/\d{2}', x.col1)) >= 1, axis=1)]
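The same pattern check can also be written without a row-wise apply, using the vectorized string methods on the column. The sketch below uses Series.str.fullmatch (available in pandas 1.1 and later) as a swapped-in alternative to re.findall. Unlike findall, it requires the whole string to match, and like any regex it only checks the shape of the value, so something like 99/99/99 would still pass. The names mask and filtered are illustrative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "col1": ["01/01/10", "10/10/18", "date", "03/12/19", "summary"],
    "col2": ["abc", "sps", "pqp", "rt", "re"],
    "col3": ["pqr", "ggg", "fdf", "sd", "ss"],
})

# True where the entire string is NN/NN/NN; na=False treats missing values as non-matches.
mask = df["col1"].str.fullmatch(r"\d{2}/\d{2}/\d{2}", na=False)

filtered = df[mask]
print(filtered)
```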

How to drop rows where a specific column value is not a date and all columns are of string type
