Regex to find a string between two strings, excluding the outer strings

Date: 2015-12-17 14:27:06

Tags: javascript regex string parsing match

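I know this has been asked a thousand times before, but I could not get the earlier solutions to work. I am trying to use regex in JavaScript to parse a text file. The pieces I want to extract are currency figures in the format 55,555.00. The number of digits varies throughout the text file, and so do the boundary characters and the amount of whitespace.

I wrote the following to extract what I need from the sample text below: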

/((\w\s{10,20})([0-9]{8,}(?=.*[,.]))/g

Sample text:

                  23205        - Grants Current-County Operatin                        4,425,327.00"

"    4   0000047387         Central Equatoria State          1003-1478 Sta Hosp Oper Oct                   85,784.00"
"    4   0000047442         EASTERN EQUATORIA ST             1003-1479 Sta Hosp Oper Oct                   93,137.00"
"    4   0000047485         JONGLEI STATE                    1003-1519 Sta Hosp Oper Oct                  144,608.00"
"    4   0000047501         Lakes State                      1003-1482 Sta Hosp Oper Oct                   93,137.00"
"    4   0000047528         Unity State                      1003-1484 Sta Hosp Oper Oct                   75,980.00"
"    4   0000047532         Northern Bahr-el State           1003-1483 Sta Hosp Oper Oct                   58,824.00"
"    4   0000047615         Western E State                  1003-1488 Sta Hosp Oper Oct                   93,137.00"
"    4   0000047638         Warap State                      1003-1486 Sta Hosp Oper Oct                   51,471.00"
"    4   0000047680         Upper Nile State                 1003-1485 Sta Hosp Oper Oct                  102,941.00"
"    4   0000047703         Western BG State                 1003-1487 Sta Hosp Oper Oct                   34,314.00"
                                                                                             ----------------------
"        Total For Period          4                                                                      833,333.00"
 ----------------------------------------------------------------------------------------------------------------------------
 Fiscal Year        2015/16                               Republic Of South Sudan                         Date     2015/11/20
 Period                   5                                                                               Time       12:58:40
                                                  FreeBalance Financial Management System                 Page              7
 ----------------------------------------------------------------------------------------------------------------------------
                                                            Vendor Analysis Report

                                                              1091 Health (MOH)
  Prd   Voucher #          Vendor Name                      Description                          Amount
  ---   ----------------   ------------------------------   -----------------------------    ----------------------
                                                                                             ----------------------
"  

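Here is an example: https://regex101.com/r/nO8nM1/4

The problem is the leading boundary. I can exclude the trailing boundary (the double quote), but I cannot get rid of the leading one. A few of my attempts do work, but they also pick up the two figures outside the main table (in this example, 4,425,327.00 and 833,333.00).

Any help is much appreciated.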

1 answer:

Answer 0: (score: 2)

To match a float value with a mandatory decimal point and , as the digit grouping symbol, you can use

\d+(?:,\d{3})*\.\d+

See the demo.

Explanation:

  • \d+ - one or more digits
  • (?:,\d{3})* - zero or more sequences of
    • , - a comma
    • \d{3} - exactly 3 digits
  • \. - a literal period/dot
  • \d+ - one or more digits.
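As a quick check of the breakdown above, here is a minimal sketch that runs the base pattern over a single line. The sample line and the variable names are made up for illustration and are not taken from the report.

```javascript
// Base pattern: digits, optional ",ddd" groups, then a mandatory fraction.
const amountPattern = /\d+(?:,\d{3})*\.\d+/g;

// Hypothetical sample line (an assumption, not part of the original report).
const sampleLine = 'Oct   85,784.00"  Total 4,425,327.00  id 1003-1478  qty 12';

// String.prototype.match with the g flag returns all matches, or null if none.
const amounts = sampleLine.match(amountPattern) || [];

console.log(amounts); // [ '85,784.00', '4,425,327.00' ]
```

Note that 1003-1478 and 12 are skipped because neither contains a decimal point, which the pattern requires.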

To get only the values that appear after Oct, you can use a regex that combines the pattern above with yours:

\w\s{10,20}(\d+(?:,\d{3})*\.\d+)

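See another demo.

\w\s{10,20} matches one word character (\w: a letter, digit, or underscore) followed by 10 to 20 whitespace characters. Only after that does the pattern match the float value, which it captures into Group 1. The two figures outside the table are skipped because more than 20 whitespace characters precede them.

See the JS snippet below (m[1] is where the float value is):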
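The effect of the 10-to-20 whitespace window can be sketched on a shortened excerpt. This is an illustration only: the three lines are abbreviated versions of report rows, the space counts (24, 19, 70) are assumptions chosen to fall outside or inside the window, and the variable names are hypothetical.

```javascript
// Combined pattern: a word character, 10 to 20 whitespace characters,
// then the amount, captured in group 1.
const rowAmount = /\w\s{10,20}(\d+(?:,\d{3})*\.\d+)/g;

// Abbreviated, hypothetical excerpt; space counts are illustrative assumptions.
const excerpt = [
  'Grants Current-County Operatin' + ' '.repeat(24) + '4,425,327.00"',
  '"    4   0000047387   Sta Hosp Oper Oct' + ' '.repeat(19) + '85,784.00"',
  '"        Total For Period          4' + ' '.repeat(70) + '833,333.00"',
].join('\n');

// matchAll yields one match object per hit; m[1] is the captured amount.
const values = Array.from(excerpt.matchAll(rowAmount), (m) => m[1]);

// Strip the grouping commas to get plain numbers.
const numbers = values.map((v) => Number(v.replace(/,/g, '')));

console.log(values, numbers);
```

Only the middle row matches here, since the other two amounts are preceded by more than 20 spaces. If the real file has rows whose gap falls outside 10 to 20 characters, the bounds would need adjusting.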

var re = /\w\s{10,20}(\d+(?:,\d{3})*\.\d+)/gm; 
var str = '                  23205        - Grants Current-County Operatin                        4,425,327.00"\n\n"    4   0000047387         Central Equatoria State          1003-1478 Sta Hosp Oper Oct                   85,784.00"\n"    4   0000047442         EASTERN EQUATORIA ST             1003-1479 Sta Hosp Oper Oct                   93,137.00"\n"    4   0000047485         JONGLEI STATE                    1003-1519 Sta Hosp Oper Oct                  144,608.00"\n"    4   0000047501         Lakes State                      1003-1482 Sta Hosp Oper Oct                   93,137.00"\n"    4   0000047528         Unity State                      1003-1484 Sta Hosp Oper Oct                   75,980.00"\n"    4   0000047532         Northern Bahr-el State           1003-1483 Sta Hosp Oper Oct                   58,824.00"\n"    4   0000047615         Western E State                  1003-1488 Sta Hosp Oper Oct                   93,137.00"\n"    4   0000047638         Warap State                      1003-1486 Sta Hosp Oper Oct                   51,471.00"\n"    4   0000047680         Upper Nile State                 1003-1485 Sta Hosp Oper Oct                  102,941.00"\n"    4   0000047703         Western BG State                 1003-1487 Sta Hosp Oper Oct                   34,314.00"\n                                                                                             ----------------------\n"        Total For Period          4                                                                      833,333.00"\n ----------------------------------------------------------------------------------------------------------------------------\n Fiscal Year        2015/16                               Republic Of South Sudan                         Date     2015/11/20\n Period                   5                                                                               Time       12:58:40\n                                                  FreeBalance Financial Management System                 Page              7\n ----------------------------------------------------------------------------------------------------------------------------\n                                                            Vendor Analysis Report\n\n                                                              1091 Health (MOH)\n  Prd   Voucher #          Vendor Name                      Description                          Amount\n  ---   ----------------   ------------------------------   -----------------------------    ----------------------\n                                                                                             ----------------------\n"  ';
var m;
 
while ((m = re.exec(str)) !== null) {
    document.getElementById("r").innerHTML += m[1] + "<br/>";
}
<div id="r"></div>