RegEx for the following strings

Time: 2016-06-29 04:26:40

Tags: regex

I am trying to create a regular expression for an ID with the following rules:

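  1. Starts with A-Z, one or more times. (Main ID, mi)
  2. Followed by an optional dash. (delimiter)
  3. Followed by 0-9, one or more times. (Sub ID, si)
  4. Followed by an optional dash or dot. (delimiter)
  5. Followed by an optional a-z or 0-9, one or more times. (Main category, mc)
  6. Followed by an optional dash or dot. (delimiter)
  7. Followed by an optional a-z or 0-9, one or more times. (Sub category, sc)
  8. If the ID alternates between alpha and numeric parts (A-01a1, A1.a.1), the delimiters can be omitted. If consecutive parts are both alpha or both numeric (A-1.1a, A1.2.3, A1a.a), the delimiter is required.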

This is what I have:

    (?P<mi>[A-Z]+)-?(?P<si>[0-9]+)[\-\.]?(?P<mc>[a-z0-9])*[\-\.]?(?P<sc>[a-z0-9])*
    

Here are the results when I try it (the <<<<< notes show the values I expected):

    ID      mi  si  mc  sc
    A1      A   1
    A001    A   001
    AB-01   AB  01
    A1aa    A   1   a      <<<<< mc=aa
    A-01a1  A   01  1      <<<<< mc=a sc=1
    A-1.1a  A   1   a      <<<<< mc=1 sc=a
    A1.a1   A   1   1      <<<<< mc=a sc=1
    A1.a.1  A   1   a   1
    A1.2.3  A   1   2   3
    A1a.a   A   1   a   a
    

3 answers:

Answer 0 (score: 2)

Description

^(?P<MainID>[a-z]+)-?(?<SubID>[0-9]+)(?:[-.]?(?P<MainCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))?(?:[-.]?(?P<SubCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))?

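This regular expression does the following:

  • Starts with A-Z, one or more times. (Main ID, mi)
  • Followed by an optional dash. (delimiter)
  • Followed by 0-9, one or more times. (Sub ID, si)
  • Followed by an optional dash or dot. (delimiter)
  • Followed by an optional a-z or 0-9, one or more times. (Main category, mc)
  • Followed by an optional dash or dot. (delimiter)
  • Followed by an optional a-z or 0-9, one or more times. (Sub category, sc)
  • If a run of text has a delimiter on its left and a delimiter or whitespace on its right, letters and digits may alternate within the same capture group
  • If the run is not bounded that way, the group captures only letters or only digits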
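
As a rough illustration of the behaviour listed above, here is a minimal Python sketch that runs the pattern over a few of the sample IDs. Two things are assumptions here: Python's re module only accepts the (?P<name>...) spelling for named groups, so the SubID group is rewritten that way, and the IGNORECASE and MULTILINE flags are inferred from the "case insensitive" and "start of a line" notes in the explanation. Each sample line ends with a newline because the lookahead needs a delimiter or whitespace after the run.

```python
import re

# Pattern from this answer; (?<SubID>...) is spelled (?P<SubID>...) for Python.
PATTERN = re.compile(
    r"^(?P<MainID>[a-z]+)-?(?P<SubID>[0-9]+)"
    r"(?:[-.]?(?P<MainCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))?"
    r"(?:[-.]?(?P<SubCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))?",
    re.IGNORECASE | re.MULTILINE,  # assumed flags, see the note above
)

# A few IDs from the sample text, one per line, each followed by a newline.
text = "A1\nA1aa\nA-01a1\nA1.a.1\n"

# One tuple per match; groups that did not take part in the match are None.
rows = [
    m.group("MainID", "SubID", "MainCategory", "SubCategory")
    for m in PATTERN.finditer(text)
]
for row in rows:
    print(row)
```

Groups that do not participate come back as None, so A1 yields only a MainID and a SubID.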

Example

Live demo

https://regex101.com/r/uH7zF3/1

Sample text

ID      mi  si  mc  sc
A1      A   1
A001    A   001
AB-01   AB  01
A1aa    A   1   a      <<<<< mc=aa
A-01a1  A   01  1      <<<<< mc=a sc=1
A-1.1a  A   1   a      <<<<< mc=1 sc=a
A1.a1   A   1   1      <<<<< mc=a sc=1
A1.a.1  A   1   a   1
A1.2.3  A   1   2   3
A1a.a   A   1   a   a

Sample matches

MATCH 1
MainID  [24-25] `A`
SubID   [25-26] `1`

MATCH 2
MainID  [38-39] `A`
SubID   [39-42] `001`

MATCH 3
MainID  [54-56] `AB`
SubID   [57-59] `01`

MATCH 4
MainID  [69-70] `A`
SubID   [70-71] `1`
MainCategory    [71-73] `aa`

MATCH 5
MainID  [104-105]   `A`
SubID   [106-108]   `01`
MainCategory    [108-109]   `a`
SubCategory [109-110]   `1`

MATCH 6
MainID  [143-144]   `A`
SubID   [145-146]   `1`
MainCategory    [147-149]   `1a`

MATCH 7
MainID  [182-183]   `A`
SubID   [183-184]   `1`
MainCategory    [185-187]   `a1`

MATCH 8
MainID  [221-222]   `A`
SubID   [222-223]   `1`
MainCategory    [224-225]   `a`
SubCategory [226-227]   `1`

MATCH 9
MainID  [243-244]   `A`
SubID   [244-245]   `1`
MainCategory    [246-247]   `2`
SubCategory [248-249]   `3`

MATCH 10
MainID  [265-266]   `A`
SubID   [266-267]   `1`
MainCategory    [267-268]   `a`
SubCategory [269-270]   `a`

Explanation

^ assert position at start of a line
(?P<MainID>[a-z]+) Named capturing group MainID
[a-z]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
-? matches the character - literally
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
(?<SubID>[0-9]+) Named capturing group SubID
[0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9
(?:[-.]?(?P<MainCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))? Non-capturing group
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
[-.]? match a single character present in the list below
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
-. a single character in the list -. literally
(?P<MainCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+) Named capturing group MainCategory
1st Alternative: (?<=[-.])[a-z0-9]+(?=[-.\s])
(?<=[-.]) Positive Lookbehind - Assert that the regex below can be matched
[-.] match a single character present in the list below
-. a single character in the list -. literally
[a-z0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
0-9 a single character in the range between 0 and 9
(?=[-.\s]) Positive Lookahead - Assert that the regex below can be matched
[-.\s] match a single character present in the list below
-. a single character in the list -. literally
\s match any white space character [\r\n\t\f ]
2nd Alternative: [a-z]+
[a-z]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
3rd Alternative: [0-9]+
[0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9
(?:[-.]?(?P<SubCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+))? Non-capturing group
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
[-.]? match a single character present in the list below
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
-. a single character in the list -. literally
(?P<SubCategory>(?<=[-.])[a-z0-9]+(?=[-.\s])|[a-z]+|[0-9]+) Named capturing group SubCategory
1st Alternative: (?<=[-.])[a-z0-9]+(?=[-.\s])
(?<=[-.]) Positive Lookbehind - Assert that the regex below can be matched
[-.] match a single character present in the list below
-. a single character in the list -. literally
[a-z0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
0-9 a single character in the range between 0 and 9
(?=[-.\s]) Positive Lookahead - Assert that the regex below can be matched
[-.\s] match a single character present in the list below
-. a single character in the list -. literally
\s match any white space character [\r\n\t\f ]
2nd Alternative: [a-z]+
[a-z]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case insensitive)
3rd Alternative: [0-9]+
[0-9]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9

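Answer 1 (score: 0)

The * quantifiers in your expression should be moved inside the capturing groups.

You can also remove the backslashes inside the character classes.

The places to change are marked with carets below: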

(?P<mi>[A-Z]+)-?(?P<si>[0-9]+)[\-\.]?(?P<mc>[a-z0-9])*[\-\.]?(?P<sc>[a-z0-9])*
                               ^ ^                   ^ ^ ^                   ^ 
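
To see why the position of the * matters, here is a small Python sketch that isolates the mc fragment of the pattern. It only demonstrates the capture behaviour of a repeated group and is not a complete fix for the ID format.

```python
import re

# Quantifier outside the group: the group is one character wide and is
# overwritten on every repetition, so only the last character survives.
outside = re.match(r"(?P<mc>[a-z0-9])*", "aa")

# Quantifier inside the group: the group spans the whole run.
inside = re.match(r"(?P<mc>[a-z0-9]*)", "aa")

print(outside.group("mc"))  # a
print(inside.group("mc"))   # aa
```

Both patterns consume the same text; they differ only in what the named group remembers.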

Answer 2 (score: 0)

I would use this:

(?<mi>[A-Z]+)-?(?<si>[0-9]+)[-.]?(?<mc>[a-z0-9]*)[-.]?(?<sc>[a-z0-9]*)
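
A minimal sketch of how this pattern could be used from Python. The (?<name>...) group syntax is rewritten as (?P<name>...) because Python's re module does not accept the shorter spelling, and parse_id is a hypothetical helper name introduced only for this example.

```python
import re

# The pattern above, with named groups in Python's (?P<name>...) spelling.
PATTERN = re.compile(
    r"(?P<mi>[A-Z]+)-?(?P<si>[0-9]+)[-.]?(?P<mc>[a-z0-9]*)[-.]?(?P<sc>[a-z0-9]*)"
)

def parse_id(text):
    """Return (mi, si, mc, sc) for an ID, or None if the whole string does not match."""
    m = PATTERN.fullmatch(text)
    return m.group("mi", "si", "mc", "sc") if m else None

for sample in ("A001", "AB-01", "A1.2.3", "A1a.a"):
    print(sample, parse_id(sample))
```

Because mc and sc use * inside the group, a missing category comes back as an empty string rather than None. A run with no delimiter, such as the a1 in A-01a1, stays in mc as a whole, so the splitting described in rule 8 is not covered by this pattern.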