Using lookahead in JavaScript to match similar patterns across multiple lines

Time: 2017-09-28 17:03:04

Tags: javascript regex

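I am trying to use a regex in JavaScript to extract the table blocks from a Cucumber sample. The sample is below:

  Feature: Sample Feature File

     Scenario: An international coffee shop must handle currencies
        Given the price list for an international coffee shop
        | product | currency | price |
        | coffee  | EUR      | 1     |
        | donut   | SEK      | 18    |
        When I buy 1 coffee and 1 donut
        Then should I pay 1 EUR and 18 SEK

     Scenario Outline: eating
        Given there are <start> cucumbers
        When I eat <eat> cucumbers
        Then I should have <left> cucumbers

        Examples:
        | start | eat | left |
        |  12   |  5  |  7   |
        |  20   |  5  |  15  |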

The regex should return the following as two separate matches:

1)

  | product | currency | price |
  | coffee  | EUR      | 1     |
  | donut   | SEK      | 18    |

2)

  | start | eat | left |
  |  12   |  5  |  7   |
  |  20   |  5  |  15  |

Once I have the blocks, I will split each one line by line to get the number of rows in the table. I have been trying to solve this with a negative lookahead expression. My attempt is below:

  /(\|)[\s\S]*\|(?!\s+\|)/gm

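However, this returns a single match that runs from the first table through the last one:

  | product | currency | price |
        | coffee  | EUR      | 1     |
        | donut   | SEK      | 18    |
        When I buy 1 coffee and 1 donut
        Then should I pay 1 EUR and 18 SEK

     Scenario Outline: eating
        Given there are <start> cucumbers
        When I eat <eat> cucumbers
        Then I should have <left> cucumbers

        Examples:
        | start | eat | left |
        |  12   |  5  |  7   |
        |  20   |  5  |  15  |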

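If the second scenario is removed, the regex works as expected and returns only

  | product | currency | price |
  | coffee  | EUR      | 1     |
  | donut   | SEK      | 18    |

Any suggestions on where my regex went wrong? Many thanks in advance.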

1 answer:

Answer 0: (score: 1)

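The [\s\S]* pattern matches any 0+ characters, line breaks included, as many as possible. The engine first consumes the rest of the string and then backtracks to the last | that is not followed by 1+ whitespace characters and another |. In your input that is the final | of the last table, so the match runs from the first | in the string to the last one. Since matches are searched for from left to right and cannot overlap, you get only one match.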
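
The greedy behaviour described above can be reproduced on a small input. This is a minimal sketch: the `text` value is a shortened, hypothetical stand-in for the feature file in the question, and only the regex is taken from the question itself.

```javascript
// Hypothetical two-table input, shortened from the question's feature file.
const text = [
  "Given the price list",
  "  | product | price |",
  "  | coffee  | 1     |",
  "When I buy 1 coffee",
  "Examples:",
  "  | start | left |",
  "  |  12   |  7   |",
].join("\n");

// The pattern from the question: [\s\S]* is greedy and also crosses line breaks.
const greedy = /(\|)[\s\S]*\|(?!\s+\|)/gm;
const greedyMatches = text.match(greedy);

console.log(greedyMatches.length); // 1
// The single match swallows the step text between the two tables.
console.log(greedyMatches[0].includes("When I buy 1 coffee")); // true
```

The lookahead only restricts where the match may end; it does nothing to stop [\s\S]* from running across the non-table lines in between.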

I suggest unrolling the pattern like this:

/^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)*/gm

See its demo here.

Note that you can make it more readable if you build it dynamically:

var h = "[^\\S\r\n]*";     // horizontal whitespace block
var rx = new RegExp("^" +     // start of a line
      h + "\\|.*\\|" +        // hor. whitespace, |, 0+ chars other than line breaks, |
      "(?:" + h + "[\r\n]+" + // 0+ sequences of hor. whitespace, line breaks, 
      h + "\\|.*\\|)*",       // hor. whitespace, |, 0+ chars other than line breaks, |
      "gm"); // Global (find multiple matches) and multiline (^ matches line start)
var s = "Feature: Sample Feature File\r\n\r\n   Scenario: An international coffee shop must handle currencies\r\n      Given the price list for an international coffee shop\r\n      | product | currency | price |\r\n      | coffee  | EUR      | 1     |\r\n      | donut   | SEK      | 18    |\r\n      When I buy 1 coffee and 1 donut\r\n      Then should I pay 1 EUR and 18 SEK\r\n\r\n   Scenario Outline: eating\r\n      Given there are <start> cucumbers\r\n      When I eat <eat> cucumbers\r\n      Then I should have <left> cucumbers\r\n\r\n      Examples:\r\n      | start | eat | left |\r\n      |  12   |  5  |  7   |\r\n      |  20   |  5  |  15  |";
console.log(s.match(rx));
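
The question mentions splitting each matched block into lines to count the rows. A possible follow-up step is sketched here, using the regex literal from this answer on a shortened input; `parseTable` is a hypothetical helper name and is not part of the original answer.

```javascript
// The unrolled pattern from this answer, as a regex literal.
const tableBlock = /^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)*/gm;

// Shortened version of the feature file, joined with CRLF like the string above.
const feature = [
  "Given the price list for an international coffee shop",
  "      | product | currency | price |",
  "      | coffee  | EUR      | 1     |",
  "      | donut   | SEK      | 18    |",
  "      When I buy 1 coffee and 1 donut",
  "",
  "      Examples:",
  "      | start | eat | left |",
  "      |  12   |  5  |  7   |",
  "      |  20   |  5  |  15  |",
].join("\r\n");

// Hypothetical helper: turn one matched block into an array of rows,
// where each row is an array of trimmed cell strings.
function parseTable(block) {
  return block
    .split(/\r?\n/)                      // one entry per line
    .map((line) => line.trim())          // drop the indentation
    .filter((line) => line.length > 0)   // ignore empty lines
    .map((line) => line.slice(1, -1)     // strip the outer pipes
      .split("|")
      .map((cell) => cell.trim()));
}

const tables = (feature.match(tableBlock) || []).map(parseTable);
console.log(tables.length);    // 2
console.log(tables[0].length); // 3 (header row plus two data rows)
console.log(tables[1][2]);     // [ '20', '5', '15' ]
```

The `|| []` guard is there because `String.prototype.match` returns `null` when a global regex finds nothing.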

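Details

  • ^ - start of a line
  • [^\S\r\n]* - 0+ horizontal whitespace characters
  • \| - a literal |
  • .* - any 0+ characters other than line break characters, as many as possible
  • \| - a literal |
  • (?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)* - zero or more sequences of:
    • [^\S\r\n]* - 0+ horizontal whitespace characters
    • [\r\n]+ - one or more CR and/or LF characters (if you want to match only 1 line break, use (?:\r\n?|\n) here)
    • [^\S\r\n]*\|.*\| - 0+ horizontal whitespace characters, a |, any 0+ characters other than line break characters, as many as possible, and a |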
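
The choice between [\r\n]+ and (?:\r\n?|\n) matters when two tables are separated only by a blank line. The sketch below illustrates the difference; `buildTableRegex` and the `twoTables` input are hypothetical and exist only for this comparison.

```javascript
// Horizontal whitespace, written with escaped \r and \n so the regex source stays printable.
const h = "[^\\S\\r\\n]*";

// Hypothetical helper: build the table pattern with a configurable line-break subpattern.
function buildTableRegex(lineBreak) {
  const row = h + "\\|.*\\|"; // hor. whitespace, |, rest of the row, |
  return new RegExp("^" + row + "(?:" + h + lineBreak + row + ")*", "gm");
}

const anyBreaks = buildTableRegex("[\\r\\n]+");       // one or more CR/LF characters
const oneBreak = buildTableRegex("(?:\\r\\n?|\\n)");  // exactly one line break

// Two tables with nothing but an empty line between them.
const twoTables = "| a | b |\n| 1 | 2 |\n\n| c | d |\n| 3 | 4 |";

console.log(twoTables.match(anyBreaks).length); // 1 (the blank line is skipped over)
console.log(twoTables.match(oneBreak).length);  // 2 (the blank line ends the block)
```

In the feature file from the question the tables are separated by step text, so both variants return the same two matches there.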