I am trying to detect account numbers in text like this:
Dear customer, your EMI for Axis Bank Loan a/c PPR012602183976 is due on
05-MAR-2019. Please keep your bank a/c funded to avoid additional charges.
In case 05-MAR-2019 is a holiday, it will be debited on next working day.
Thank you. 012602183976
Update: Loan closure documents for A/c 49270094 dispatched on 22-Feb-2019
via BSA LOGISTICS AWB RA7904521. Duplicate NOC request is chargeable
The regular expression I am using is:
(?:(((a/c)|account)\s*))([A-Z]+)?[0-9]+\b
However, running
pattern = re.compile(r'(?:(((a/c)|account)\s*))([A-Z]+)?[0-9]+\b', re.IGNORECASE)
pattern.findall(text)
gives me [('a/c ', 'a/c', 'a/c', 'PPR'), ('A/c ', 'A/c', 'A/c', '')]
instead of the actual account numbers.
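It appears to have captured the non-capturing group instead of the capturing group.
This regex works as expected on regexpal.com, but not on pythex.org,
so I guess this is a Python-specific issue.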
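A minimal sketch of what seems to be going on, in case it helps narrow things down: as far as I can tell from the Python documentation, re.findall returns the captured groups rather than the whole match whenever the pattern contains capturing groups, which may be why the two testers disagree. The sample text is shortened here, and the names whole_matches, variant and accounts, as well as the simplified variant pattern, are assumptions for illustration and not part of my original code.

```python
import re

# Shortened sample text (an assumption: only the parts relevant to matching are kept).
text = (
    "Dear customer, your EMI for Axis Bank Loan a/c PPR012602183976 is due on "
    "05-MAR-2019. Thank you. 012602183976 "
    "Update: Loan closure documents for A/c 49270094 dispatched on 22-Feb-2019"
)

# Original pattern. The inner parentheses are capturing groups, so findall()
# returns tuples of those groups; finditer() with group(0) gives the whole match.
original = re.compile(r'(?:(((a/c)|account)\s*))([A-Z]+)?[0-9]+\b', re.IGNORECASE)
whole_matches = [m.group(0) for m in original.finditer(text)]
print(whole_matches)  # ['a/c PPR012602183976', 'A/c 49270094']

# Hypothetical variant: the prefix is non-capturing and a single capturing
# group wraps the account number, so findall() returns just that group.
variant = re.compile(r'(?:a/c|account)\s*([A-Z]*[0-9]+)\b', re.IGNORECASE)
accounts = variant.findall(text)
print(accounts)  # ['PPR012602183976', '49270094']
```

If this reading is right, the pattern itself matches what I want, and only the way findall reports the match differs.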
Any suggestions would be appreciated.