Regex to exclude two consecutive uppercase letters

Time: 2018-08-18 08:53:38

Tags: python regex regex-lookarounds regex-group

I'm having trouble coming up with a regular expression for the following task.

For example, given the call below:
regex_exp(address, "OG 56432")


it should return

"OG 56432: Middle Street Pollocksville | 686"


address is a list of strings:

address = [
  "622 Gordon Lane St. Louisville OH 52071",
  "432 Main Long Road St. Louisville OH 43071",
  "686 Middle Street Pollocksville OG 56432"
]


My current solution looks like this (Python):

import re
def regex_exp(address, zipcode):
    for i in address:
        if zipcode in i:
            postal_code = (re.search(r"[A-Z]{2}\s[0-9]{5}", i)).group(0)
            # returns "OG 56432"

            digits = (re.search(r"\d+", i)).group(0)
            # returns "686"

            address = (re.search(r"\D+", i)).group(0)
            # returns " Middle Street Pollocksville OG " (including the surrounding spaces)

            print(postal_code + ":" + address + "| " + digits)

regex_exp(address, "OG 56432")
# prints OG 56432: Middle Street Pollocksville OG | 686

As the second paragraph shows, this is not the right answer. The value I need returned is

"OG 56432: Middle Street Pollocksville | 686"

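How should I write the regex search for the address variable so that it excludes two consecutive uppercase letters? I have tried something like

address = (re.search("?!\D+", i)).group(0)

to remove the two consecutive uppercase letters, based on "A regular expression to exclude a word/string", but I think this is a step in the wrong direction.

PS: I know there are simpler ways to solve this, but I want to use regex to improve my fundamentals.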

2 answers:

Answer 0: (score: 0)

If you only want to remove the two consecutive uppercase letters that precede the ZIP code (5 digits), use this:

import re
text = "432 Main Long PC Market Road St. Louisville OG 43071"
address = re.sub(r'([A-Z]{2}[\s]{1})(?=[\d]{5})','',text)
print(address) 
# Output: 432 Main Long PC Market Road St. Louisville 43071
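The same lookahead idea can also be applied to the address search from the question. Note that a lookaround has to be written inside parentheses, as (?=...) or (?!...), which is why a bare "?!\D+" is rejected by the re module. Below is a minimal sketch, assuming every line ends with a two-letter uppercase state code followed by a 5-digit ZIP; the names STREET_RE and extract_street are made up for illustration.

```python
import re

# Lazily match non-digits and stop just before " <STATE> <ZIP>".
# Assumption: the state code is two uppercase letters and the ZIP has 5 digits.
STREET_RE = re.compile(r"\D+?(?=\s[A-Z]{2}\s\d{5})")

def extract_street(line):
    """Return the street/city part of line, or None if the pattern does not match."""
    match = STREET_RE.search(line)
    return match.group(0).strip() if match else None

print(extract_street("686 Middle Street Pollocksville OG 56432"))
# Middle Street Pollocksville
```

The lookahead only checks what follows, so the state code and ZIP are not consumed and can still be extracted by a separate search.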

To remove every occurrence of two consecutive uppercase letters, drop the lookahead:

import re
text = "432 Main Long PC Market Road St. Louisville OG 43071"
address = re.sub(r'([A-Z]{2}[\s]{1})','',text)
print(address)
# Output: 432 Main Long Market Road St. Louisville 43071
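One caveat when removing every pair of uppercase letters: a pattern such as [A-Z]{2}\s also matches the last two letters of a longer uppercase word. A possible refinement, shown here only as a sketch with a hypothetical input string, is to add \b word boundaries so that only standalone two-letter codes are removed.

```python
import re

text = "12 NASA Road TX 77058"  # hypothetical input, not from the question

# Without word boundaries, the "SA " at the end of "NASA " is removed too.
loose = re.sub(r"[A-Z]{2}\s", "", text)

# With \b on both sides, only standalone two-letter codes are removed.
strict = re.sub(r"\b[A-Z]{2}\b\s", "", text)

print(loose)   # 12 NARoad 77058
print(strict)  # 12 NASA Road 77058
```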

Answer 1: (score: 0)

This can be done with re.sub() and capture groups.
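As a minimal sketch of what a re.sub() approach with capture groups could look like for the question's address format (not necessarily this answer's exact code): the pattern below assumes each line is "<number> <street and city> <STATE> <ZIP>", and ADDRESS_RE is a name chosen for illustration.

```python
import re

# Groups: 1 = house number, 2 = street and city, 3 = state code plus ZIP.
# Assumption: every line follows "<number> <street and city> <STATE> <ZIP>".
ADDRESS_RE = re.compile(r"^(\d+)\s+(.+?)\s+([A-Z]{2}\s\d{5})$")

def regex_exp(addresses, zipcode):
    """Return the reformatted line that ends with zipcode, or None if none matches."""
    for line in addresses:
        if line.endswith(zipcode) and ADDRESS_RE.match(line):
            # Reorder the captured groups with backreferences.
            return ADDRESS_RE.sub(r"\3: \2 | \1", line)
    return None

address = [
    "622 Gordon Lane St. Louisville OH 52071",
    "432 Main Long Road St. Louisville OH 43071",
    "686 Middle Street Pollocksville OG 56432",
]
print(regex_exp(address, "OG 56432"))
# OG 56432: Middle Street Pollocksville | 686
```

Because the state code is captured together with the ZIP in group 3, it never ends up in the street part, so no separate exclusion step is needed.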
