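I am trying to return one of the following strings (depending on the input):
f23/24 /or/ f23-24 /or/ f23+24
Ideally the code would always come back in the form f23-24, regardless of how the input is written.
The input strings look like this: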
build-f23/24 1st pass demo (50:50) #Should output f23-24 or f23/24
build-f17-22 1st pass demo (50:50) #Should output f17-22
build-f-1 +14 1st pass demo (50:50) #Should output f1-14 or f1+14
Exception:
Some strings will not have a second group of digits:
build-f45 1st pass demo (50:50) #Should output f45
Where I am so far:
So far I have this regex, but it always fails if the separator char is a slash:
regex = r"(\s?)(\-?)(f)(\s?)([\+\-\/]?)(\d\d*)(-?)(\d?\d*)"
tmp = re.search(regex, val)[0]
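To see where the slash case goes wrong, it can help to print what the expression actually matches for each sample. The following is only a small diagnostic sketch: the `samples` list and the `matched` name are introduced here for illustration, and the regex is copied unchanged from above.

```python
import re

# The regex from the question, unchanged.
regex = r"(\s?)(\-?)(f)(\s?)([\+\-\/]?)(\d\d*)(-?)(\d?\d*)"

# Sample inputs from the question (list name is illustrative).
samples = [
    "build-f23/24 1st pass demo (50:50)",
    "build-f17-22 1st pass demo (50:50)",
    "build-f-1 +14 1st pass demo (50:50)",
    "build-f45 1st pass demo (50:50)",
]

# Full match text for each sample.
matched = [re.search(regex, s)[0] for s in samples]
for s, m in zip(samples, matched):
    print(repr(m), "<-", s)
```

The second separator is matched by `(-?)`, which accepts only a hyphen, so the match stops before `/24` and before ` +14`. The leading hyphen is also part of the match, because `(\-?)` sits in front of `(f)`.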
Answer 0: (score: 3)
For the test data, you can try the following regex: -(f)-?(\d+)(?:\s*([-+/]\d+))?
import re
val = '''
build-f23/24 1st pass demo (50:50)
build-f17-22 1st pass demo (50:50)
build-f-1 +14 1st pass demo (50:50)
build-f45 1st pass demo (50:50)
'''
expected = [['f23-24', 'f23/24'], ['f17-22'], ['f1-14', 'f1+14'], ['f45']]
for m, x in zip(re.findall(r'-(f)-?(\d+)(?:\s*([-+/]\d+))?', val), expected):
    result = ''.join(m)
    print(result in x, ':', result)
True : f23/24
True : f17-22
True : f1+14
True : f45
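If the goal is to always get the f23-24 form, one possible variation is to capture the separator outside the last group and join the two numbers with a hyphen. This is a minimal sketch, not part of the answer above: `CODE_RE` and `extract_code` are hypothetical names, and the pattern differs from the original only in where the capturing parenthesis sits.

```python
import re

# Variation on the regex above: the separator is matched but not captured,
# so group 3 holds only the second number.
CODE_RE = re.compile(r'-(f)-?(\d+)(?:\s*[-+/](\d+))?')

def extract_code(text):
    """Return 'f<first>-<second>' or 'f<first>', or None if nothing matches."""
    m = CODE_RE.search(text)
    if m is None:
        return None
    prefix, first, second = m.groups()
    # second is None when the string has no second group of digits.
    return f"{prefix}{first}-{second}" if second else f"{prefix}{first}"

for line in [
    "build-f23/24 1st pass demo (50:50)",
    "build-f17-22 1st pass demo (50:50)",
    "build-f-1 +14 1st pass demo (50:50)",
    "build-f45 1st pass demo (50:50)",
]:
    print(extract_code(line))
```

Returning `None` for a non-matching string, instead of indexing the result of `re.search` directly, avoids a `TypeError` on inputs that contain no code.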
Answer 1: (score: 1)
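This is a fairly complicated expression, and I am not sure I understand the full scope, but let's start with an expression that outputs what is desired and then work through the problem step by step: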
.+?(-.+?)([a-z][0-9]+?)?\s|(?:[+][0-9])?([0-9]+)?(.+)
# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility
import re
regex = r".+?(-.+?)([a-z][0-9]+?)?\s|(?:[+][0-9])?([0-9]+)?(.+)"
test_str = ("build-f23/24 1st pass demo (50:50)\n"
"build-f17-22 1st pass demo (50:50)\n"
"build-f-1 +14 1st pass demo (50:50)")
subst = "\\1\\2\\3"
# You can limit the number of replacements with the count argument (0 means replace all)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)
if result:
    print(result)
# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
Answer 2: (score: 1)
import re
dat = """build-f23/24 1st pass demo (50:50)
build-f17-22 1st pass demo (50:50)
build-f-1 +14 1st pass demo (50:50)
build-f45 1st pass demo (50:50)"""
rgx = r'(?mi)^.*(?<=-)(f)\D?(\d+)(?:\s?([+\/-]\d+))?.*$'
re.sub(rgx,r'\1\2\3',dat).split()
['f23/24', 'f17-22', 'f1+14', 'f45']
Or you can do this:
rgx1 = r'(?mi)^.*(?<=-)(f)\D?(\d+)(?:\s?[+\/-](\d+))?.*$'
re.sub('(?m)-$','',re.sub(rgx1 ,r'\1\2-\3',dat)).split()
['f23-24', 'f17-22', 'f1-14', 'f45']
Or, instead of calling sub twice, you can substitute directly:
re.sub(rgx1,lambda x: f'{x.group(1)}{x.group(2)}-{x.group(3)}'
if x.group(3) else f'{x.group(1)}{x.group(2)}',dat).split()
['f23-24', 'f17-22', 'f1-14', 'f45']
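For readers who find rgx1 hard to parse, the same pattern can be written with `re.VERBOSE` so that each part carries a comment. This is only an annotated sketch of the expression above: `RGX1_VERBOSE` and `codes` are names made up for this example, and it uses `finditer` rather than `sub` to collect the results.

```python
import re

# rgx1 from above, spelled out with re.VERBOSE; the inline (?mi) flags
# are passed as re.MULTILINE | re.IGNORECASE instead.
RGX1_VERBOSE = re.compile(r"""
    ^.*          # everything before the code
    (?<=-)       # the previous character must be a hyphen
    (f)          # group 1: the letter f
    \D?          # an optional non-digit, such as the '-' in 'f-1'
    (\d+)        # group 2: the first number
    (?:          # optional second part:
        \s?      #   at most one whitespace character
        [+/-]    #   the separator
        (\d+)    #   group 3: the second number
    )?
    .*$          # the rest of the line
""", re.VERBOSE | re.MULTILINE | re.IGNORECASE)

def codes(data):
    """Collect one normalized code per matching line."""
    out = []
    for m in RGX1_VERBOSE.finditer(data):
        f, first, second = m.groups()
        out.append(f + first + ("-" + second if second else ""))
    return out

dat = """build-f23/24 1st pass demo (50:50)
build-f17-22 1st pass demo (50:50)
build-f-1 +14 1st pass demo (50:50)
build-f45 1st pass demo (50:50)"""
print(codes(dat))
```

Because the pattern is anchored with `^` and `$` under `re.MULTILINE`, each match covers one whole line, and lines without a code are skipped.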