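I am trying to create a regular expression for IDs that follow these rules:
If the ID alternates between alpha and numeric parts (A-01a1, A1.a.1), the separators may be omitted. If consecutive parts are both alpha or both numeric (A-1.1a, A1.2.3, A1a.a), the separators are required.
This is what I have:
(?P<mi>[A-Z]+)-?(?P<si>[0-9]+)[\-\.]?(?P<mc>[a-z0-9])*[\-\.]?(?P<sc>[a-z0-9])*
Here are the results when I try it (rows marked <<<<< show the values I expected):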
ID       mi   si    mc   sc
A1       A    1
A001     A    001
AB-01    AB   01
A1aa     A    1     a         <<<<< mc=aa
A-01a1   A    01    1         <<<<< mc=a sc=1
A-1.1a   A    1     a         <<<<< mc=1 sc=a
A1.a1    A    1     1         <<<<< mc=a sc=1
A1.a.1   A    1     a    1
A1.2.3   A    1     2    3
A1a.a    A    1     a    a
Answer 0 (score: 2)
^(?P<MainID>[a-z]+)-?(?<SubID>[0-9]+)(?:[-.]?(?P<MainCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))?(?:[-.]?(?P<SubCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))?
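This regular expression does the following:
Optionally matches a-z or 0-9, one or more times, for each trailing part (main category and subcategory, mc and sc)
If a run of text is preceded by a delimiter and followed by a delimiter or whitespace, the characters may alternate between letters and digits within the same capture group
If the run is not enclosed that way, the group captures only letters or only digits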
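The behaviour described above can be checked with a small Python sketch. It makes a few assumptions that are not part of the answer: the pattern is compiled with `re.IGNORECASE` and `re.MULTILINE` (the explanation mentions case-insensitive matching and `^` at the start of a line), `(?<SubID>...)` is written as `(?P<SubID>...)` because Python's `re` module expects that spelling, and the input is one ID per line with a trailing newline. The names `PATTERN`, `SAMPLE`, and `rows` are illustrative only.

```python
import re

# (?<SubID>...) from the PCRE pattern is spelled (?P<SubID>...) for Python's re.
PATTERN = re.compile(
    r"^(?P<MainID>[a-z]+)-?(?P<SubID>[0-9]+)"
    r"(?:[-.]?(?P<MainCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))?"
    r"(?:[-.]?(?P<SubCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))?",
    re.IGNORECASE | re.MULTILINE,
)

# Assumed input: one ID per line. The trailing newline lets the (?=[-.\s])
# lookahead succeed on the last ID as well.
SAMPLE = "A1\nA001\nAB-01\nA1aa\nA-01a1\nA-1.1a\nA1.a1\nA1.a.1\nA1.2.3\nA1a.a\n"

# One dict per matched ID; groups that did not take part in the match are None.
rows = [m.groupdict() for m in PATTERN.finditer(SAMPLE)]
for row in rows:
    print(row)
```

Because the first alternative needs a delimiter or whitespace after the run, an ID such as `A1.a1` at the very end of the input (with no trailing newline) would fall through to the letters-only and digits-only alternatives instead.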
Live demo
https://regex101.com/r/uH7zF3/1
Sample text
ID mi si mc sc
A1 A 1
A001 A 001
AB-01 AB 01
A1aa A 1 a <<<<< mc=aa
A-01a1 A 01 1 <<<<< mc=a sc=1
A-1.1a A 1 a <<<<< mc=1 sc=a
A1.a1 A 1 1 <<<<< mc=a sc=1
A1.a.1 A 1 a 1
A1.2.3 A 1 2 3
A1a.a A 1 a a
Sample matches
MATCH 1
MainID [24-25] `A`
SubID [25-26] `1`
MATCH 2
MainID [38-39] `A`
SubID [39-42] `001`
MATCH 3
MainID [54-56] `AB`
SubID [57-59] `01`
MATCH 4
MainID [69-70] `A`
SubID [70-71] `1`
MainCategory [71-73] `aa`
MATCH 5
MainID [104-105] `A`
SubID [106-108] `01`
MainCategory [108-109] `a`
SubCategory [109-110] `1`
MATCH 6
MainID [143-144] `A`
SubID [145-146] `1`
MainCategory [147-149] `1a`
MATCH 7
MainID [182-183] `A`
SubID [183-184] `1`
MainCategory [185-187] `a1`
MATCH 8
MainID [221-222] `A`
SubID [222-223] `1`
MainCategory [224-225] `a`
SubCategory [226-227] `1`
MATCH 9
MainID [243-244] `A`
SubID [244-245] `1`
MainCategory [246-247] `2`
SubCategory [248-249] `3`
MATCH 10
MainID [265-266] `A`
SubID [266-267] `1`
MainCategory [267-268] `a`
SubCategory [269-270] `a`
^ assert position at start of a line
(?P<MainID>[a-z]+) Named capturing group MainID
[a-z]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
-? matches the character - literally
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
(?<SubID>[0-9]+) Named capturing group SubID
[0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9
(?:[-.]?(?P<MainCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))? Non-capturing group
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
[-.]? match a single character present in the list below
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
-. a single character in the list -. literally
(?P<MainCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+) Named capturing group MainCategory
1st Alternative: (?<=[-.])[a-z0-9]+(?=[-.\s])
(?<=[-.]) Positive Lookbehind - Assert that the regex below can be matched
[-.] match a single character present in the list below
-. a single character in the list -. literally
[a-z0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
0-9 a single character in the range between 0 and 9
(?=[-.\s]) Positive Lookahead - Assert that the regex below can be matched
[-.\s] match a single character present in the list below
-. a single character in the list -. literally
\s match any white space character [\r\n\t\f ]
2nd Alternative: [a-z]+
[a-z]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
3rd Alternative: [0-9]+
[0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9
(?:[-.]?(?P<SubCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))? Non-capturing group
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
[-.]? match a single character present in the list below
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
-. a single character in the list -. literally
(?P<SubCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+) Named capturing group SubCategory
1st Alternative: (?<=[-.])[a-z0-9]+(?=[-.\s])
(?<=[-.]) Positive Lookbehind - Assert that the regex below can be matched
[-.] match a single character present in the list below
-. a single character in the list -. literally
[a-z0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
0-9 a single character in the range between 0 and 9
(?=[-.\s]) Positive Lookahead - Assert that the regex below can be matched
[-.\s] match a single character present in the list below
-. a single character in the list -. literally
\s match any white space character [\r\n\t\f ]
2nd Alternative: [a-z]+
[a-z]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
3rd Alternative: [0-9]+
[0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9
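Answer 1 (score: 0)
The * quantifiers in your expression should be moved inside the capture groups.
You can also remove the backslashes inside the character classes.
The characters to change are marked here:
(?P<mi>[A-Z]+)-?(?P<si>[0-9]+)[\-\.]?(?P<mc>[a-z0-9])*[\-\.]?(?P<sc>[a-z0-9])*
                               ^ ^                   ^ ^ ^                   ^
It should look like this:
(?P<mi>[A-Z]+)-?(?P<si>[0-9]+)[-.]?(?P<mc>[a-z0-9]*)[-.]?(?P<sc>[a-z0-9]*)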
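To illustrate why the position of `*` matters, here is a minimal Python sketch. When a quantifier is applied to a capture group, the group keeps only its last repetition; when the quantifier sits inside the group, the whole run is captured. The names `ORIGINAL` and `FIXED` are made up for this example.

```python
import re

# Quantifier outside the groups: each repetition overwrites the previous capture.
ORIGINAL = re.compile(
    r"(?P<mi>[A-Z]+)-?(?P<si>[0-9]+)[\-\.]?(?P<mc>[a-z0-9])*[\-\.]?(?P<sc>[a-z0-9])*"
)

# Quantifier inside the groups: the whole run of characters is captured.
FIXED = re.compile(
    r"(?P<mi>[A-Z]+)-?(?P<si>[0-9]+)[-.]?(?P<mc>[a-z0-9]*)[-.]?(?P<sc>[a-z0-9]*)"
)

before = ORIGINAL.match("A1aa").groupdict()
after = FIXED.match("A1aa").groupdict()

print(before)  # mc holds only the last 'a'; sc never matched, so it is None
print(after)   # mc holds 'aa'; sc matched the empty string
```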
Answer 2 (score: 0)
I would use this:
(?<mi>[A-Z]+)-?(?<si>[0-9]+)[-.]?(?<mc>[a-z0-9]*)[-.]?(?<sc>[a-z0-9]*)
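As a rough check, this expression can be tried in Python. The sketch below rests on two assumptions: the groups are written as `(?P<name>...)`, which is the spelling Python's `re` module expects, and the whole string must be an ID, hence `fullmatch`. The helper `parse_id` is hypothetical and exists only for this example.

```python
import re

ID_RE = re.compile(
    r"(?P<mi>[A-Z]+)-?(?P<si>[0-9]+)[-.]?(?P<mc>[a-z0-9]*)[-.]?(?P<sc>[a-z0-9]*)"
)


def parse_id(text):
    """Return (mi, si, mc, sc) if the whole string is an ID, otherwise None."""
    m = ID_RE.fullmatch(text)
    return m.group("mi", "si", "mc", "sc") if m else None


for sample in ["A1", "A1aa", "A-01a1", "A1.a.1", "A1.2.3", "A1a.a"]:
    print(sample, parse_id(sample))
```

With this pattern, missing parts come back as empty strings rather than `None`. Parts that alternate without a separator stay together in `mc`: for example, `A-01a1` gives `mc='a1'` and an empty `sc`, so splitting it into `mc=a`, `sc=1` would need extra logic.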