Question

My column Name contains the following values:

NY0528_3
NY5366_2
4536
NY1244_5
5363
PH1734_3

Expected output:

No matter what I try, I can't come up with a general solution, and I need one because I have 200,000 rows. Thanks.

Answer 1

You can use str.extract:

# Raw string avoids an invalid-escape warning; expand=False returns a Series
df.Name.str.extract(r'(\d+)', expand=False)

Output:

0    0528
1    5366
2    4536
3    1244
4    5363
5    1734
Name: Name, dtype: object
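
For anyone who wants to try this end to end, here is a minimal sketch. It assumes the column is called Name (as in the answer above) and rebuilds the DataFrame from the sample values in the question; the astype(str) call is an extra precaution in case some rows are stored as numbers rather than strings.

```python
import pandas as pd

# Sample values copied from the question; the column name "Name" is an assumption
# taken from the answer above.
df = pd.DataFrame(
    {"Name": ["NY0528_3", "NY5366_2", "4536", "NY1244_5", "5363", "PH1734_3"]}
)

# (\d+) captures the first run of digits in each value.
# expand=False returns a Series instead of a one-column DataFrame.
digits = df["Name"].astype(str).str.extract(r"(\d+)", expand=False)

print(digits.tolist())
```

The results stay as strings, which keeps the leading zero in 0528. Converting them to integers would drop it.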

Answer 2

Try a regular expression:

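import re

def clean(teststring):
    # Returns a list of every non-overlapping run of four digits, e.g. ['0528'].
    # str() guards against rows that are stored as numbers rather than strings.
    return re.findall(r"[0-9]{4}", str(teststring))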

If your data is in df.col, run:

df.col.apply(clean)
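
Note that re.findall returns a list, so each row ends up holding a list such as ['0528'] rather than a plain string. If a single value per row is wanted, one possible variant is sketched below. The helper name first_digits is hypothetical, and it takes the first run of digits of any length rather than exactly four.

```python
import re

def first_digits(value):
    # Hypothetical helper: return the first run of digits as a string,
    # or None when the value contains no digits at all.
    match = re.search(r"\d+", str(value))
    return match.group(0) if match else None

# Sample values copied from the question.
samples = ["NY0528_3", "NY5366_2", "4536", "NY1244_5", "5363", "PH1734_3"]
results = [first_digits(s) for s in samples]
print(results)
```

It can be applied the same way as clean, for example df.col.apply(first_digits).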