Question

I need a little help.

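I'm new to Python (I'm using the 3.0 version bundled with Anaconda), and I'd like to use a regular expression to validate a list of numbers and return only the ones that match a pattern (say, \d{11} for 11 digits). I'm using pandas to get the list: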

import pandas as pd

df = pd.DataFrame(columns=['phoneNumber','count'], data=[
    ['08034303939',11],
    ['08034382919',11],
    ['0802329292',10],
    ['09039292921',11]])

When I print all the items with

for row in df.iterrows(): # dataframe.iterrows() returns tuple
    print(row[1][0])

it returns every item, with no regex validation. But when I try to validate with this

for row in df.iterrows(): # dataframe.iterrows() returns tuple
    print(re.compile(r"\d{11}").search(row[1][0]).group())

it raises an AttributeError (because search() returns None for a value that does not match).

How can I fix this, or is there a simpler way?

Answer 1

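If you want to validate, you can use str.match and convert the result to a boolean mask with astype(bool):

x = df['phoneNumber'].str.match(r'\d{11}').astype(bool)
x

0     True
1     True
2    False
3     True
Name: phoneNumber, dtype: bool

You can then use boolean indexing to return only the rows that contain valid phone numbers.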
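
As a minimal sketch of the boolean-indexing step, assuming the same df as in the question: the names mask, valid, pattern and checked below are illustrative and not part of the original answer. One thing to be aware of is that str.match only anchors at the start of the string, so \d{11} would also accept a 12-digit value. The sketch therefore uses str.fullmatch (available in pandas 1.1 and later, as far as I know) to require exactly 11 digits. The second half shows one way the original loop could avoid the AttributeError, by checking for None before calling .group().

```python
import re

import pandas as pd

df = pd.DataFrame(columns=['phoneNumber', 'count'], data=[
    ['08034303939', 11],
    ['08034382919', 11],
    ['0802329292', 10],
    ['09039292921', 11]])

# Boolean mask: True only where the whole string is exactly 11 digits.
mask = df['phoneNumber'].str.fullmatch(r'\d{11}').astype(bool)

# Boolean indexing keeps only the rows where the mask is True.
valid = df[mask]
print(valid['phoneNumber'].tolist())

# Plain-re alternative: fullmatch() returns None when the value does not
# match, so test the result before calling .group().
pattern = re.compile(r'\d{11}')
checked = []
for number in df['phoneNumber']:
    m = pattern.fullmatch(number)
    if m is not None:
        checked.append(m.group())
```

Both approaches should keep the three 11-digit rows and drop '0802329292'. The pandas version is usually preferable on larger frames because it avoids a Python-level loop.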
