Use a regular expression to extract the part of a string in which a search keyword is found

Date: 2018-10-31 09:20:17

Tags: regex

The string looks like this:

Maximum utilisation

A Borrower (or the Parent) may not deliver a Utilisation Request if as a result of the proposed Utilisation:<br/>
[10] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; [or]<br/>
[15] or more Revolving Company Utilisations would be outstanding[; or<br/>
[20] or more Incremental Company Loans would be outstanding].<br/>
A Borrower (or the Parent) may not request that a Company A Loan [or an Incremental Company Loan] be divided if, as a result of the proposed division, [  25      ] or more Company A Loans [or [  50    ] or more Incremental Company Loans] would be outstanding.<br/>
[A Borrower (or the Parent) may not request that a Company B Loan or a Company C Loan be divided.]

Expected output:

[ 10 ] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; 
[ 15 ] or more Revolving Company Utilisations would be outstanding[; or
[ 20 ] or more Incremental Company Loans would be outstanding].

What I am trying does not seem to work:

Regex = '.*other than Incremental Company Loans.*'

This returns the whole paragraph. There may be other ways to do this, but we can only use a regex.

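1 answer:

Answer 0 (score: 0)

A pure regex approach may not be enough, because you will probably want to replace <br/> with newlines afterwards, and the pattern is complex: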

(?<=^|<br/>)(?:(?!<br/>).)*other than Incremental Company Loans[\s\S]*?(?=[.;]<br/>|$)

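See the regex demo.

It matches:

  • (?<=^|<br/>) - a position at the start of the string or immediately after a <br/> substring
  • (?:(?!<br/>).)* - zero or more characters, none of which starts a <br/> substring, so this part cannot run past a <br/>
  • other than Incremental Company Loans - the search string
  • [\s\S]*? - any zero or more characters, including line breaks, as few as possible
  • (?=[.;]<br/>|$) - a position immediately followed by . or ; and then <br/>, or by the end of the string. The lookahead consumes nothing, so the closing . or ; is not part of the match.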
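As a minimal sketch of how the pattern above might be applied in C#, the following uses Regex.Match and then replaces <br/> with newlines. The class name ClauseExtractor, the method name Extract, and the shortened sample string are assumptions made for illustration. They do not come from the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ClauseExtractor
{
    // Pattern copied from the answer. The verbatim string keeps the backslashes as written.
    private const string Pattern =
        @"(?<=^|<br/>)(?:(?!<br/>).)*other than Incremental Company Loans[\s\S]*?(?=[.;]<br/>|$)";

    // Hypothetical helper: returns the first match with <br/> turned into
    // newlines, or an empty string when the search string is not found.
    public static string Extract(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success ? m.Value.Replace("<br/>", "\n") : string.Empty;
    }

    public static void Main()
    {
        // Shortened sample with the same shape as the string in the question.
        var s = "Intro:<br/>[10] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; [or]<br/>"
              + "[15] or more Revolving Company Utilisations would be outstanding[; or<br/>"
              + "[20] or more Incremental Company Loans would be outstanding].<br/>Next paragraph.";
        Console.WriteLine(Extract(s));
    }
}
```

Because the final lookahead does not consume anything, the extracted text ends at "would be outstanding]" without the closing period. If the period is needed, one option is to move [.;] out of the lookahead so that it is matched.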

Since you are coding in C#, you can use a non-regex solution that is easy to understand and easy to adjust:

var s = "A Borrower (or the Parent) may not deliver a Utilisation Request if as a result of the proposed Utilisation:<br/>[10] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; [or]<br/>[15] or more Revolving Company Utilisations would be outstanding[; or<br/>[20] or more Incremental Company Loans would be outstanding].<br/>A Borrower (or the Parent) may not request that a Company A Loan [or an Incremental Company Loan] be divided if, as a result of the proposed division, [  25      ] or more Company A Loans [or [  50    ] or more Incremental Company Loans] would be outstanding.<br/>[A Borrower (or the Parent) may not request that a Company B Loan or a Company C Loan be divided.]";
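// Requires: using System; using System.Linq;
// Split on <br/>, skip segments until the one containing the search string,
// then keep segments up to and including the first one ending in "." or ";".
// MagicTakeWhile is a custom extension method, not part of System.Linq.
var result = s.Split(new[] {"<br/>"}, StringSplitOptions.None)
    .SkipWhile(x => !x.Contains("other than Incremental Company Loans"))
    .MagicTakeWhile(x => !x.EndsWith(".") && !x.EndsWith(";"));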
Console.WriteLine(string.Join("\n", result));

Output:

[10] or more Term Loans [(other than Incremental Company Loans)] would be outstanding; [or]
[15] or more Revolving Company Utilisations would be outstanding[; or
[20] or more Incremental Company Loans would be outstanding].

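The MagicTakeWhile method is borrowed from "TakeWhile, but get the element that stopped it also". It takes items while the condition holds and also returns the first item for which the condition no longer holds.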
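MagicTakeWhile is not part of System.Linq, so the snippet above does not compile without a definition. The following is a minimal sketch of an extension method with the behaviour described. It is an assumption about how such a method could be written, not necessarily the code from the linked answer, and the class names EnumerableExtensions and Program are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Yields items while the predicate holds, then also yields the first
    // item for which the predicate fails, and stops there.
    public static IEnumerable<T> MagicTakeWhile<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            yield return item;
            if (!predicate(item))
                yield break; // the item that stopped the sequence was already yielded
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var segments = new[] { "first; or", "second", "third.", "fourth" };
        // Keep segments up to and including the first one that ends with a period.
        var taken = segments.MagicTakeWhile(x => !x.EndsWith(".")).ToList();
        Console.WriteLine(string.Join("\n", taken));
    }
}
```

The built-in TakeWhile would drop "third." here, because it stops before the first element that fails the predicate. Yielding the item before testing it is what keeps the terminating segment.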