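How do I capture all the data from a specific column?

Time: 2016-06-14 22:07:25

Tags: regex

I want to match all the data in a specific column. For example, I want to extract every value from the "Test2" column in the sample below.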

common: "mortalkombat_sonia_rules_abc," player: "Mortal Kombat,"    22-May-22 
Test1   Test2   Type1   Type2   Type3   X   Y   HOR1    VER1    Data1    Error1         
r1107   ab-1    abcr0201    222 22  -222    -22 -222    -22 2   2   Testing     
r1106   ab-1    abcr0201    222 22  -222    -22 -222    -22 2   2   Testing     
c377    ab-1    abcf0402    222 2   -222    -22 -222    -22 2   2   Testing     
r632    ab-1    abcd0402    222 22  -222    -22 -222    -22 2   2   Testing

1 answer:

Answer 0 (score: 3)

Description

^(?!common:)(?:([^\s\n]+)\s+){2}


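This regular expression does the following:

  • Skips the first line, which begins with common:
  • Captures the value of the second column in capture group 1 (a repeated capture group keeps only its last iteration)
  • Extends to other columns: change the number in the trailing {2} to select the column you want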
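
As a rough illustration of the last point, here is a minimal Python sketch that builds the same pattern with a configurable repeat count. The function name `column_values`, its parameters, and the shortened sample text are assumptions made for this example, not part of the original answer. It also assumes multiline mode (`re.MULTILINE`) so that `^` matches at the start of every line.

```python
import re


def column_values(text, column, skip_prefix="common:"):
    """Return the value in the 1-based `column` of every line that
    does not start with `skip_prefix` (hypothetical helper)."""
    pattern = re.compile(
        # Same structure as the answer's regex; only the repeat count varies.
        r"^(?!%s)(?:([^\s\n]+)\s+){%d}" % (re.escape(skip_prefix), column),
        re.MULTILINE,  # let ^ match at the start of each line
    )
    # With a single capture group, findall returns the group-1 strings.
    return pattern.findall(text)


# Shortened version of the sample text, for illustration only.
SAMPLE = """\
common: "mortalkombat_sonia_rules_abc," player: "Mortal Kombat,"    22-May-22
Test1   Test2   Type1
r1107   ab-1    abcr0201
r1106   ab-2    abcr0201
c377    ab-3    abcf0402
r632    ab-4    abcd0402
"""

print(column_values(SAMPLE, 2))  # ['Test2', 'ab-1', 'ab-2', 'ab-3', 'ab-4']
```

Note that the pattern requires whitespace after the selected column, so a value at the very end of the input with no trailing whitespace or newline would not match.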

Example

Live demo

https://regex101.com/r/rX1dL1/2

Sample text

common: "mortalkombat_sonia_rules_abc," player: "Mortal Kombat,"    22-May-22 
Test1   Test2   Type1   Type2   Type3   X   Y   HOR1    VER1    Data1    Error1         
r1107   ab-1    abcr0201    222 22  -222    -22 -222    -22 2   2   Testing     
r1106   ab-2    abcr0201    222 22  -222    -22 -222    -22 2   2   Testing     
c377    ab-3    abcf0402    222 2   -222    -22 -222    -22 2   2   Testing     
r632    ab-4    abcd0402    222 22  -222    -22 -222    -22 2   2   Testing

Sample matches

MATCH 1
1.  [87-92] `Test2`

MATCH 2
1.  [176-180]   `ab-1`

MATCH 3
1.  [257-261]   `ab-2`

MATCH 4
1.  [338-342]   `ab-3`

MATCH 5
1.  [419-423]   `ab-4`
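
A listing in this shape can be produced with a short Python sketch along the following lines. The helper name `matches_with_offsets` and the abbreviated input are assumptions for illustration, so the offsets it prints will differ from the ones listed above.

```python
import re

# The answer's pattern, compiled in multiline mode so ^ matches per line.
PATTERN = re.compile(r"^(?!common:)(?:([^\s\n]+)\s+){2}", re.MULTILINE)


def matches_with_offsets(text):
    """Return (start, end, value) for capture group 1 of every match."""
    return [(m.start(1), m.end(1), m.group(1)) for m in PATTERN.finditer(text)]


# Abbreviated input, for illustration only.
text = (
    'common: "x," player: "y,"\n'
    "Test1   Test2   Type1\n"
    "r1107   ab-1    abcr0201\n"
)

for number, (start, end, value) in enumerate(matches_with_offsets(text), start=1):
    print(f"MATCH {number}: [{start}-{end}] `{value}`")
```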

Explanation

NODE                     EXPLANATION
----------------------------------------------------------------------
  ^                        the beginning of a "line"
----------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
----------------------------------------------------------------------
    common:                  'common:'
----------------------------------------------------------------------
  )                        end of look-ahead
----------------------------------------------------------------------
  (?:                      group, but do not capture (2 times):
----------------------------------------------------------------------
    (                        group and capture to \1:
----------------------------------------------------------------------
      [^\s\n]+                 any character except: whitespace (\n,
                               \r, \t, \f, and " "), '\n' (newline)
                               (1 or more times (matching the most
                               amount possible))
----------------------------------------------------------------------
    )                        end of \1
----------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
----------------------------------------------------------------------
  ){2}                     end of grouping
----------------------------------------------------------------------
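
For readers who prefer the breakdown next to the pattern itself, the same expression can be written in Python's verbose mode. This is only a sketch: the constant name `COLUMN_2` and the tiny input are assumptions, and the `re.MULTILINE` flag is assumed so that `^` matches at the beginning of each line.

```python
import re

COLUMN_2 = re.compile(
    r"""
    ^               # beginning of a line (relies on re.MULTILINE)
    (?!common:)     # the line must not start with 'common:'
    (?:             # group, but do not capture (repeated 2 times)
        ([^\s\n]+)  # capture a run of non-whitespace into group 1
        \s+         # the whitespace that separates columns
    ){2}            # the repeat count selects the column
    """,
    re.MULTILINE | re.VERBOSE,
)

# Tiny input, for illustration only.
print(COLUMN_2.findall("common: x y\nA B C\n1 2 3\n"))  # ['B', '2']
```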