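I'm trying to extract table blocks from a Cucumber sample with a JavaScript regular expression. The sample is below:

Feature: Sample Feature File

Scenario: An international coffee shop must handle currencies
Given the price list for an international coffee shop
| product | currency | price |
| coffee | EUR | 1 |
| donut | SEK | 18 |
When I buy 1 coffee and 1 donut
Then should I pay 1 EUR and 18 SEK

Scenario Outline: eating
Given there are <start> cucumbers
When I eat <eat> cucumbers
Then I should have <left> cucumbers

Examples:
| start | eat | left |
| 12 | 5 | 7 |
| 20 | 5 | 15 |

The regex should return the following as two matches:

1)
| product | currency | price |
| coffee | EUR | 1 |
| donut | SEK | 18 |
2)
| start | eat | left |
| 12 | 5 | 7 |
| 20 | 5 | 15 |

Once I have the blocks, I will split each one by line to get the number of rows in the table. I'm trying to solve this with a negative lookahead. My attempt is below:

/(\|)[\s\S]*\|(?!\s+\|)/gm

However, that returns

| product | currency | price |
| coffee | EUR | 1 |
| donut | SEK | 18 |
When I buy 1 coffee and 1 donut
Then should I pay 1 EUR and 18 SEK
Scenario Outline: eating
Given there are <start> cucumbers
When I eat <eat> cucumbers
Then I should have <left> cucumbers
Examples:
| start | eat | left |
| 12 | 5 | 7 |
| 20 | 5 | 15 |

If I remove the second scenario, the regex works as expected and returns only

| product | currency | price |
| coffee | EUR | 1 |
| donut | SEK | 18 |

Any suggestions on where my regex goes wrong? Thanks a lot in advance.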
Answer 0 (score: 1)
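The [\s\S]* pattern matches any 0+ characters, as many as possible, up to the last | in the string that is not immediately followed by 1+ whitespace characters and another |. Because it is greedy, it runs across line breaks and ordinary text from the first | to that last |. Since matches are searched for from left to right, you only get a single match.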
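To make the greedy behaviour concrete, here is a minimal sketch that runs the pattern from the question against a shortened, made-up sample (the `sample` text and variable names are illustrative assumptions, not taken from the question):

```javascript
// Hypothetical, shortened feature text with two tables separated by ordinary lines.
const sample = [
  "Given the price list",
  "| product | price |",
  "| coffee | 1 |",
  "When I buy 1 coffee",
  "Examples:",
  "| start | left |",
  "| 12 | 7 |",
].join("\n");

// The pattern from the question: [\s\S]* is greedy and also crosses line breaks.
const greedy = /(\|)[\s\S]*\|(?!\s+\|)/gm;
const greedyMatches = sample.match(greedy);

console.log(greedyMatches.length);                       // 1
console.log(greedyMatches[0].startsWith("| product"));   // true
console.log(greedyMatches[0].endsWith("| 7 |"));         // true
```

The single match starts at the first | of the first table and ends at the last | of the second table, so the text between the tables is swallowed as well.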
I suggest unrolling the pattern like this:
/^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)*/gm
Note that you can make it more readable by building it dynamically:
var h = "[^\\S\r\n]*"; // horizontal whitespace block
var rx = new RegExp("^" + // start of a line
h + "\\|.*\\|" + // hor. whitespace, |, 0+ chars other than line breaks, |
"(?:" + h + "[\r\n]+" + // 0+ sequences of hor. whitespace, line breaks,
h + "\\|.*\\|)*", // hor. whitespace, |, 0+ chars other than line breaks, |
"gm"); // Global (find multiple matches) and multiline (^ matches line start)
var s = "Feature: Sample Feature File\r\n\r\n Scenario: An international coffee shop must handle currencies\r\n Given the price list for an international coffee shop\r\n | product | currency | price |\r\n | coffee | EUR | 1 |\r\n | donut | SEK | 18 |\r\n When I buy 1 coffee and 1 donut\r\n Then should I pay 1 EUR and 18 SEK\r\n\r\n Scenario Outline: eating\r\n Given there are <start> cucumbers\r\n When I eat <eat> cucumbers\r\n Then I should have <left> cucumbers\r\n\r\n Examples:\r\n | start | eat | left |\r\n | 12 | 5 | 7 |\r\n | 20 | 5 | 15 |";
console.log(s.match(rx));
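The question mentions splitting each matched block by line to count the rows. One possible way to do that is sketched below; `parseTables` and the `feature` text are hypothetical names and data introduced only for illustration, and the cell splitting assumes that cells never contain an escaped | character.

```javascript
// The unrolled pattern suggested above, written as a regex literal.
const tableRx = /^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)*/gm;

// Hypothetical helper: returns one array per table, each holding rows of trimmed cells.
function parseTables(text) {
  return (text.match(tableRx) || []).map(block =>
    block
      .split(/[\r\n]+/)                    // one entry per table row
      .map(line =>
        line
          .trim()                          // drop indentation and trailing spaces
          .slice(1, -1)                    // drop the outer | ... |
          .split("|")
          .map(cell => cell.trim())
      )
  );
}

// Made-up sample input for the sketch.
const feature = [
  "Scenario: coffee shop",
  "  Given the price list",
  "    | product | currency | price |",
  "    | coffee  | EUR      | 1     |",
  "    | donut   | SEK      | 18    |",
  "  When I buy 1 coffee and 1 donut",
  "",
  "  Examples:",
  "    | start | eat | left |",
  "    | 12    | 5   | 7    |",
  "    | 20    | 5   | 15   |",
].join("\n");

const tables = parseTables(feature);
console.log(tables.length);      // 2
console.log(tables[0].length);   // 3 (header row plus two data rows)
console.log(tables[1][2]);       // [ '20', '5', '15' ]
```

The row count of a table is then simply the length of the corresponding inner array, minus one if the header row should not be counted.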
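Details
^ - start of a line
[^\S\r\n]* - 0+ horizontal whitespace characters
\| - a | character
.* - any 0+ characters other than line break characters, as many as possible
\| - a | character
(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)* - zero or more sequences of:
  [^\S\r\n]* - 0+ horizontal whitespace characters
  [\r\n]+ - one or more CR and/or LF characters (if you only want to match a single line break, use (?:\r\n?|\n) here)
  [^\S\r\n]*\|.*\| - 0+ horizontal whitespace characters, a |, any 0+ characters other than line break characters, as many as possible, and a |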
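The difference between [\r\n]+ and (?:\r\n?|\n) only shows up when two tables are separated by nothing but blank lines. The sketch below compares both variants on a made-up input; the names `loose`, `strict`, and `twoTables` are assumptions introduced for this illustration.

```javascript
const hs = "[^\\S\\r\\n]*";          // horizontal whitespace block
const row = hs + "\\|.*\\|";         // one table row: optional indent, |, anything, |

// Variant from the answer: one or more line break characters between rows.
const loose = new RegExp("^" + row + "(?:" + hs + "[\\r\\n]+" + row + ")*", "gm");

// Variant with exactly one line break (CRLF, CR, or LF) between rows.
const strict = new RegExp("^" + row + "(?:" + hs + "(?:\\r\\n?|\\n)" + row + ")*", "gm");

// Two tables separated only by an empty line.
const twoTables = "| a |\n| 1 |\n\n| b |\n| 2 |";

console.log(twoTables.match(loose).length);   // 1 (the blank line is absorbed by [\r\n]+)
console.log(twoTables.match(strict).length);  // 2
```

Which variant fits depends on whether tables separated by blank lines should count as one block or as separate blocks.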