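I know this has been asked a thousand times before, but I could not get any of the earlier solutions to work. I am trying to use a regex in JavaScript to parse a text file. The bits I want to extract are currency amounts in the format 55,555.00. The number of digits varies throughout the text file, and so do the surrounding boundary characters and whitespace.
I wrote the following to extract what I need from the sample text below: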
/((\w\s{10,20})([0-9]{8,}(?=.*[,.]))/g
Sample text:
23205 - Grants Current-County Operatin 4,425,327.00"
" 4 0000047387 Central Equatoria State 1003-1478 Sta Hosp Oper Oct 85,784.00"
" 4 0000047442 EASTERN EQUATORIA ST 1003-1479 Sta Hosp Oper Oct 93,137.00"
" 4 0000047485 JONGLEI STATE 1003-1519 Sta Hosp Oper Oct 144,608.00"
" 4 0000047501 Lakes State 1003-1482 Sta Hosp Oper Oct 93,137.00"
" 4 0000047528 Unity State 1003-1484 Sta Hosp Oper Oct 75,980.00"
" 4 0000047532 Northern Bahr-el State 1003-1483 Sta Hosp Oper Oct 58,824.00"
" 4 0000047615 Western E State 1003-1488 Sta Hosp Oper Oct 93,137.00"
" 4 0000047638 Warap State 1003-1486 Sta Hosp Oper Oct 51,471.00"
" 4 0000047680 Upper Nile State 1003-1485 Sta Hosp Oper Oct 102,941.00"
" 4 0000047703 Western BG State 1003-1487 Sta Hosp Oper Oct 34,314.00"
----------------------
" Total For Period 4 833,333.00"
----------------------------------------------------------------------------------------------------------------------------
Fiscal Year 2015/16 Republic Of South Sudan Date 2015/11/20
Period 5 Time 12:58:40
FreeBalance Financial Management System Page 7
----------------------------------------------------------------------------------------------------------------------------
Vendor Analysis Report
1091 Health (MOH)
Prd Voucher # Vendor Name Description Amount
--- ---------------- ------------------------------ ----------------------------- ----------------------
----------------------
"
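Here is an example: https://regex101.com/r/nO8nM1/4
The problem is the leading boundary. I can exclude the trailing boundary (the double quote), but I cannot get rid of the leading one. I have come up with a few patterns that work, but they also pick up the two numbers outside the main table (4,425,327.00 and 833,333.00 in this example).
Any help is greatly appreciated.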
Answer 0 (score: 2)
To match a float value with a mandatory decimal point and , as the digit grouping symbol, you can use
\d+(?:,\d{3})*\.\d+
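As a rough illustration of what this pattern picks up, here is a minimal sketch. The sample lines are shortened, hypothetical stand-ins modelled on the report in the question, and `amountPattern` is just an illustrative name.

```javascript
// Matches grouped decimal amounts such as 85,784.00 or 4,425,327.00.
const amountPattern = /\d+(?:,\d{3})*\.\d+/g;

// Shortened, hypothetical lines modelled on the report in the question.
const sample = [
  'Sta Hosp Oper Oct 85,784.00"',
  'Total For Period 4 833,333.00"',
  'Voucher 0000047387 dated 2015/11/20',
].join('\n');

// match() with the g flag returns every match, or null when there is none.
const amounts = sample.match(amountPattern) || [];
console.log(amounts); // [ '85,784.00', '833,333.00' ]
```

Note that the pattern on its own also matches the total line, because it only describes the shape of the number and says nothing about what precedes it.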
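See the demo.
Explanation:
\d+ - one or more digits
(?:,\d{3})* - zero or more sequences of:
  , - a comma
  \d{3} - exactly 3 digits
\. - a literal period/dot
\d+ - one or more digits.
To get only the values that appear after Oct, you can use a regex that combines the pattern above with yours: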
\w\s{10,20}(\d+(?:,\d{3})*\.\d+)
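See another demo.
\w\s{10,20} matches one word character (\w: a letter, digit, or underscore) followed by 10 to 20 whitespace characters; only after that is the float value matched and captured into group 1.
See the JS snippet below (m[1] is where the float value is):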
var re = /\w\s{10,20}(\d+(?:,\d{3})*\.\d+)/gm;
var str = ' 23205 - Grants Current-County Operatin 4,425,327.00"\n\n" 4 0000047387 Central Equatoria State 1003-1478 Sta Hosp Oper Oct 85,784.00"\n" 4 0000047442 EASTERN EQUATORIA ST 1003-1479 Sta Hosp Oper Oct 93,137.00"\n" 4 0000047485 JONGLEI STATE 1003-1519 Sta Hosp Oper Oct 144,608.00"\n" 4 0000047501 Lakes State 1003-1482 Sta Hosp Oper Oct 93,137.00"\n" 4 0000047528 Unity State 1003-1484 Sta Hosp Oper Oct 75,980.00"\n" 4 0000047532 Northern Bahr-el State 1003-1483 Sta Hosp Oper Oct 58,824.00"\n" 4 0000047615 Western E State 1003-1488 Sta Hosp Oper Oct 93,137.00"\n" 4 0000047638 Warap State 1003-1486 Sta Hosp Oper Oct 51,471.00"\n" 4 0000047680 Upper Nile State 1003-1485 Sta Hosp Oper Oct 102,941.00"\n" 4 0000047703 Western BG State 1003-1487 Sta Hosp Oper Oct 34,314.00"\n ----------------------\n" Total For Period 4 833,333.00"\n ----------------------------------------------------------------------------------------------------------------------------\n Fiscal Year 2015/16 Republic Of South Sudan Date 2015/11/20\n Period 5 Time 12:58:40\n FreeBalance Financial Management System Page 7\n ----------------------------------------------------------------------------------------------------------------------------\n Vendor Analysis Report\n\n 1091 Health (MOH)\n Prd Voucher # Vendor Name Description Amount\n --- ---------------- ------------------------------ ----------------------------- ----------------------\n ----------------------\n" ';
var m;
// Walk through every match; m[1] holds the captured amount.
while ((m = re.exec(str)) !== null) {
  document.getElementById("r").innerHTML += m[1] + "<br/>";
}
<div id="r"></div>
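Outside a browser, the same idea can collect the captures into an array and convert them to numbers. The following is only a sketch under stated assumptions: `extractAmounts` is a hypothetical helper name, and the column padding in `rows` (12 spaces before a table amount, 40 before the total) is assumed for illustration, since the real spacing of the file is not known here.

```javascript
// Hypothetical helper: returns the amounts that follow a word character and
// 10 to 20 whitespace characters, converted to numbers.
function extractAmounts(text) {
  const re = /\w\s{10,20}(\d+(?:,\d{3})*\.\d+)/g;
  const values = [];
  for (const m of text.matchAll(re)) {
    // m[1] is the captured amount; drop the grouping commas before parsing.
    values.push(parseFloat(m[1].replace(/,/g, '')));
  }
  return values;
}

// Assumed layout: 12 spaces before a table amount, 40 before the total.
const rows = [
  'Sta Hosp Oper Oct' + ' '.repeat(12) + '85,784.00"',
  'Sta Hosp Oper Oct' + ' '.repeat(12) + '93,137.00"',
  'Total For Period 4' + ' '.repeat(40) + '833,333.00"',
].join('\n');

console.log(extractAmounts(rows)); // [ 85784, 93137 ]
```

With this assumed padding the total line is skipped because more than 20 whitespace characters separate the amount from the preceding word character. If the real file uses different column widths, the {10,20} range would need adjusting.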