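I'm learning regular expressions through lesson 29 of Al Sweigart's Automate the Boring Stuff course on Udemy. I get an error message saying "unbalanced parenthesis at position 414 (line 12, column 1)".
The code is meant to extract phone numbers and email addresses using regular expressions.
I've tried counting the parentheses, and I removed the top and bottom parentheses from the email regex.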
#! python3
import re, pyperclip
# Done - TODO: create a regex object for phone numbers
phoneRegex = re.compile(r'''
# Types of number 415-555-0000, 555-0000, (415) 555-0000, 555-0000 ext 12345,
# ext. 12345, x12345
(
((\d\d\d) | (\(\d\d\d\)))? # area code (optional)
(\s|-) # first separator
\d\d\d # first 3 digits
- # separator
\d\d\d\d # last 4 digits
((ext(\.)?\s)|x) # extension word part (optional)
(\d{2,5}))? # extension number part (optional)
)
''', re.VERBOSE)
# TODO: Create a regex for email addresses
emailRegex = re.compile (r'''
# some.+_thing@(\d{2,5}))?.com
[a-zA-Z0-9_.+]+ # name part - created non default regular expression class
# to capture any character a-z lowercase, A-Z upper case, numbers 0-9, characters _.+
@ # @ symbol
[a-zA-Z0-9_.+]+ # domain name part
''', re.VERBOSE)
# TODO: Get the text off the clipboard
text = pyperclip.paste()
# TODO: Extract the email/phone from this text
extractedPhone = phoneRegex.findall(text) # creates one string for each group ()
# Make sure desired regex is all in one group ()
extractedEmail = emailRegex.findall(text)
print (extractedPhone)# temporary print function to see if code works
print (extractedEmail)
This gives the following error:
Traceback (most recent call last):
  File "C:\Users*\Desktop\Education\Education\Computerscience\automaticing the boring stuff\programs\lesson 29 phone and email regex.py", line 18, in <module>
    ''', re.VERBOSE)
  File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\re.py", line 234, in compile
    return _compile(pattern, flags)
  File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\re.py", line 286, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 944, in parse
    raise source.error("unbalanced parenthesis")
re.error: unbalanced parenthesis at position 414 (line 12, column 1)
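Answer 0 (score: 1)
You need to fix this line:
(\d{2,5}))? # extension number part (optional)
It has one closing parenthesis too many. The extra ) closes the outer group early, so the ) on the following line has nothing left to match.
Changing the line to
(\d{2,5})?
resolves the unbalanced parenthesis error.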
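As a quick check, here is a minimal sketch of the question's phone pattern with only that one parenthesis removed. The variable name phone_regex and the sample string are made up for illustration and are not from the original post.

```python
import re

# The question's phone pattern with the extra ')' removed from the
# extension-number line; nothing else is changed.
phone_regex = re.compile(r'''
(
((\d\d\d) | (\(\d\d\d\)))?   # area code (optional)
(\s|-)                       # first separator
\d\d\d                       # first 3 digits
-                            # separator
\d\d\d\d                     # last 4 digits
((ext(\.)?\s)|x)             # extension word part
(\d{2,5})?                   # extension number part (optional)
)
''', re.VERBOSE)

# Hypothetical sample text, not taken from the question.
sample = "Call 415-555-0000x12345 today."
match = phone_regex.search(sample)
print(match.group(1))
```

The pattern now compiles. Note that the extension word part is still required as written, despite the "(optional)" comment in the question, so a plain 415-555-0000 without an x or ext suffix will not match. That is a separate issue from the compile error.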
Answer 1 (score: 1)
Formatting the pattern with RegexFormat 9 shows where the parentheses are unbalanced and may suggest a fix.
# Types of number 415-555-0000, 555-0000, (415) 555-0000, 555-0000 ext 12345,
# ext. 12345, x12345
( # (1 start)
( # (2 start), area code (optional)
( \d\d\d ) # (3)
| ( \( \d\d\d \) ) # (4)
)? # (2 end)
( \s | - ) # (5), first separator
\d\d\d # first 3 digits
- # separator
\d\d\d\d # last 4 digits
( # (6 start), extension word part (optional)
( # (7 start)
ext
( \. )? # (8)
\s
) # (7 end)
| x
) # (6 end)
( \d{2,5} ) # (9), extension number part (optional)
)? # (1 end)
= ) <-- Unbalanced ')'
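If you don't have a formatting tool at hand, Python's own exception carries similar location information: re.error exposes msg, pos, lineno and colno attributes (Python 3.5 and later). The sketch below catches the error and points at the offending line of the pattern. The helper name show_regex_error and the shortened pattern are hypothetical, used only for illustration.

```python
import re

def show_regex_error(pattern, flags=0):
    """Try to compile `pattern`; on failure, return a short report that
    points at the offending line and column (hypothetical helper)."""
    try:
        re.compile(pattern, flags)
    except re.error as err:
        # lineno and colno are 1-based positions inside the pattern string.
        bad_line = pattern.split('\n')[err.lineno - 1]
        pointer = ' ' * (err.colno - 1) + '^'
        return (f"{err.msg} at line {err.lineno}, column {err.colno}:\n"
                f"{bad_line}\n{pointer}")
    return "pattern compiled without errors"

# Shortened stand-in for the question's pattern, keeping the extra ')'.
broken = r'''
(
\d\d\d-\d\d\d\d     # number
(\d{2,5}))?         # extension number, one closing parenthesis too many
)
'''

print(show_regex_error(broken, re.VERBOSE))
```

The reported line is the one holding the ) that has nothing to close, which is why the question's error points at line 12 of the pattern rather than at the line with the actual typo just above it.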