Question

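I am trying to extract the table blocks from a Cucumber feature file with a regular expression in JavaScript. The sample feature file is below:

Feature: Sample Feature File

   Scenario: An international coffee shop must handle currencies
      Given the price list for an international coffee shop
      | product | currency | price |
      | coffee  | EUR      | 1     |
      | donut   | SEK      | 18    |
      When I buy 1 coffee and 1 donut
      Then should I pay 1 EUR and 18 SEK

   Scenario Outline: eating
      Given there are <start> cucumbers
      When I eat <eat> cucumbers
      Then I should have <left> cucumbers

      Examples:
      | start | eat | left |
      |  12   |  5  |  7   |
      |  20   |  5  |  15  |

The regex should return the following in two matches, like so:

1)

  | product | currency | price |
  | coffee  | EUR      | 1     |
  | donut   | SEK      | 18    |

2)

  | start | eat | left |
  |  12   |  5  |  7   |
  |  20   |  5  |  15  |

Once I have the blocks, I will split each one line by line to get the number of rows in the table. I am attempting to solve this with a negative lookahead expression. My effort is below:

/(\|)[\s\S]*\|(?!\s+\|)/gm

However, it returns a single match that runs from the first table to the end of the second:

| product | currency | price |
      | coffee  | EUR      | 1     |
      | donut   | SEK      | 18    |
      When I buy 1 coffee and 1 donut
      Then should I pay 1 EUR and 18 SEK

   Scenario Outline: eating
      Given there are <start> cucumbers
      When I eat <eat> cucumbers
      Then I should have <left> cucumbers

      Examples:
      | start | eat | left |
      |  12   |  5  |  7   |
      |  20   |  5  |  15  |

If the second scenario is removed, the regex works as expected and returns only

  | product | currency | price |
  | coffee  | EUR      | 1     |
  | donut   | SEK      | 18    |

Any suggestions on where my regex goes wrong? Many thanks in advance.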

Answer 1

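The [\s\S]* pattern matches any 0+ characters, as many as possible, up to the last | in the string that is not immediately followed by 1+ whitespace characters and another |. Because [\s\S] also matches line breaks, the greedy quantifier runs past the text between the two tables, and since matches are searched for from left to right, you only get a single match.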
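The single-match behavior can be reproduced on a shortened input. This is only a sketch: the `sample` text below is an assumed, trimmed-down stand-in for the feature file, and the variable names are illustrative.

```javascript
// Shortened sample (assumed for illustration): two tables separated by plain text.
const sample = [
  "Given a price list",
  "  | product | price |",
  "  | coffee  | 1     |",
  "When I buy 1 coffee",
  "Examples:",
  "  | start | left |",
  "  | 12    | 7    |",
].join("\n");

// The pattern from the question: greedy [\s\S]* followed by a negative lookahead.
const greedy = /(\|)[\s\S]*\|(?!\s+\|)/gm;
const matches = sample.match(greedy);

// Only one match is found, and it swallows the text between the tables.
console.log(matches.length);                                // 1
console.log(matches[0].includes("When I buy 1 coffee"));    // true
```

The lookahead only rejects a | that is followed by whitespace and another |, so the very last | in the input always satisfies it, and the greedy quantifier reaches it first.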

I suggest unrolling the pattern like this:

/^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)*/gm


Note that you can make it more readable if you build it dynamically:

var h = "[^\\S\r\n]*";     // horizontal whitespace block
var rx = new RegExp("^" +     // start of a line
      h + "\\|.*\\|" +        // hor. whitespace, |, 0+ chars other than line breaks, |
      "(?:" + h + "[\r\n]+" + // 0+ sequences of hor. whitespace, line breaks, 
      h + "\\|.*\\|)*",       // hor. whitespace, |, 0+ chars other than line breaks, |
      "gm"); // Global (find multiple matches) and multiline (^ matches line start)
var s = "Feature: Sample Feature File\r\n\r\n   Scenario: An international coffee shop must handle currencies\r\n      Given the price list for an international coffee shop\r\n      | product | currency | price |\r\n      | coffee  | EUR      | 1     |\r\n      | donut   | SEK      | 18    |\r\n      When I buy 1 coffee and 1 donut\r\n      Then should I pay 1 EUR and 18 SEK\r\n\r\n   Scenario Outline: eating\r\n      Given there are <start> cucumbers\r\n      When I eat <eat> cucumbers\r\n      Then I should have <left> cucumbers\r\n\r\n      Examples:\r\n      | start | eat | left |\r\n      |  12   |  5  |  7   |\r\n      |  20   |  5  |  15  |";
console.log(s.match(rx));
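Since the goal in the question is to split each block line by line and count the rows, the matches can be post-processed as sketched below. `parseTables` and the shortened `feature` text are hypothetical names and data introduced for illustration; the pattern is the same one as above, written as a regex literal.

```javascript
// Same pattern as above, as a regex literal.
const tableRx = /^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)*/gm;

// Shortened feature text (assumed for illustration), using CRLF line endings.
const feature = [
  "Given the price list",
  "      | product | currency | price |",
  "      | coffee  | EUR      | 1     |",
  "      | donut   | SEK      | 18    |",
  "When I buy 1 coffee and 1 donut",
  "Examples:",
  "      | start | eat | left |",
  "      |  12   |  5  |  7   |",
].join("\r\n");

// Hypothetical helper: returns one array of rows per table, each row an array of cells.
function parseTables(text) {
  return (text.match(tableRx) || []).map(block =>
    block
      .split(/[\r\n]+/)                 // one entry per table row
      .map(line => line.trim())         // drop indentation
      .map(line => line.slice(1, -1)    // drop the outer pipes
        .split("|")
        .map(cell => cell.trim()))
  );
}

const tables = parseTables(feature);
console.log(tables.length);     // 2 tables
console.log(tables[0].length);  // 3 rows in the first table (header included)
console.log(tables[0][1]);      // [ 'coffee', 'EUR', '1' ]
```

The `|| []` fallback is there because `String.prototype.match` returns `null` when a global regex finds nothing.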

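Details

^ - start of a line
[^\S\r\n]* - 0+ horizontal whitespace characters
\| - a | character
.* - any 0+ characters other than line break characters, as many as possible
\| - a | character
(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)* - zero or more sequences of:
- [^\S\r\n]* - 0+ horizontal whitespace characters
- [\r\n]+ - one or more CR and/or LF characters (if you only want to match a single line break, use (?:\r\n?|\n) here)
- [^\S\r\n]*\|.*\| - 0+ horizontal whitespace characters, a |, any 0+ characters other than line break characters, as many as possible, and a |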
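The difference between [\r\n]+ and a single line break shows up when two tables are separated only by a blank line. The input below is an assumed minimal example, and the variable names are illustrative.

```javascript
// Two small tables separated by one blank line (assumed input).
const text = "| a | b |\n| 1 | 2 |\n\n| c | d |\n| 3 | 4 |";

// [\r\n]+ allows any run of line breaks between rows, so the blank line is bridged.
const anyBreaks = /^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)*/gm;

// (?:\r\n?|\n) allows exactly one line break between rows, so the blank line ends the table.
const oneBreak = /^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*(?:\r\n?|\n)[^\S\r\n]*\|.*\|)*/gm;

console.log(text.match(anyBreaks).length); // 1 (both tables merged into one match)
console.log(text.match(oneBreak).length);  // 2 (one match per table)
```

Which variant fits depends on whether blank lines inside a table should be tolerated or treated as a separator.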

JavaScript regex to match similar patterns over multiple lines using lookahead
