I have a DataFrame like this in df_init:
column1
0 hi all, i am fine
1 How are you ? 123 a45
2 123444234324!!! (This is also string)
3 sdsfds sdfsdf 233423
5 adsfd xcvbb cbcvbcvcbc
I want to get all the rows of this DataFrame whose values contain digits or alphanumeric tokens. I expect df_final to look like this:
column1
0 How are you ? 123 a45
1 123444234324!!! (This is also string)
2 sdsfds sdfsdf 233423
Answer 0 (score: 1)
Use str.contains with \d and filter with boolean indexing:
df = df[df.column1.str.contains(r'\d')]
print (df)
column1
1 How are you ? 123 a45
2 123444234324!!! (This is also string)
3 sdsfds sdfsdf 233423
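For reference, here is a self-contained sketch of the same filter that rebuilds the sample data from the question. The names df_init and df_final follow the question; the reset_index(drop=True) call is an addition, assumed here only because the expected output in the question is numbered from 0.

```python
import pandas as pd

# Sample data rebuilt from the question.
df_init = pd.DataFrame({
    "column1": [
        "hi all, i am fine",
        "How are you ? 123 a45",
        "123444234324!!! (This is also string)",
        "sdsfds sdfsdf 233423",
        "adsfd xcvbb cbcvbcvcbc",
    ]
})

# Boolean mask: True where the string contains at least one digit.
mask = df_init["column1"].str.contains(r"\d")

# Keep only the matching rows; renumber the index from 0 (assumption:
# the question's expected df_final uses a fresh 0..n-1 index).
df_final = df_init[mask].reset_index(drop=True)
print(df_final)
```

If the column may contain missing values, passing na=False to str.contains keeps the mask strictly boolean.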
Edit:
print (df)
column1
0 hi all, i am fine d78
1 How are you ? 123 a45
2 123444234324!!!
3 sdsfds sdfsdf 233423
4 adsfd xcvbb cbcvbcvcbc
5 234324@
6 123! vc
df = df[df.column1.str.contains(r'^\d+[<!\-[.*?\]>@]+$')]
print (df)
column1
2 123444234324!!!
5 234324@
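The pattern in the edit appears to select strings that consist only of one or more digits followed by one or more characters from the set < ! - [ . * ? ] > @, anchored at both ends. A small sketch using the standard re module (which str.contains relies on for regex matching) shows which of the sample strings pass; the names pattern, samples and matched are illustrative only.

```python
import re

# Same pattern as in the edit: digits, then only characters from the set,
# anchored at the start (^) and end ($) of the string.
pattern = re.compile(r'^\d+[<!\-[.*?\]>@]+$')

samples = [
    "hi all, i am fine d78",
    "How are you ? 123 a45",
    "123444234324!!!",
    "sdsfds sdfsdf 233423",
    "adsfd xcvbb cbcvbcvcbc",
    "234324@",
    "123! vc",
]

# re.search mirrors how str.contains tests each value.
matched = [s for s in samples if pattern.search(s)]
print(matched)
```

"123! vc" is rejected because the space and the letters after "!" are not in the character set, so the end anchor cannot match.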