Regular expression in Java to handle dynamic string values

Time: 2013-07-01 17:46:52

Tags: java regex

I have a string containing the following value:

  

TOTAL DUE-STATEMENT$240.05911 Fee$10.00FRANCHISE TAX$.172VSALES TAX$.53LOCAL-TAX$.23SERVICE DISCOUNT-$50.00PAYMENT - THANK YOU-$100.00HBO+STARLET$100.00

I need to split this string into key/value pairs.

TOTAL DUE-STATEMENT $240.05
911 Fee $10.00
FRANCHISE TAX $.17
2VSALES TAX $.53
LOCAL-TAX $.23
SERVICE DISCOUNT -$50.00
PAYMENT - THANK YOU -$100.00
HBO+STARLET $100.00

My string value will always be dynamic, and the descriptions are dynamic too, except for 911 Fee. I wrote the following regular expression.

([911 a-zA-Z |911 a-zA-Z|a-zA-Z |a-zA-Z \\-? a-zA-Z|! ?|+? ]+)(-?\\$[0-9|,]*\\.[0-9][0-9])

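I get the key/value pairs correctly, except when a description contains a mix of digits, letters, and special characters. My output is as follows: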

TOTAL DUE-STATEMENT $240.05
911 Fee $10.00
FRANCHISE TAX $.17
SALES TAX $.53   **which is wrong** (expected key: 2VSALES TAX)
LOCAL-TAX $.23
SERVICE DISCOUNT -$50.00
PAYMENT - THANK YOU-  $100.00   **the "-" ends up in the key** (expected key: PAYMENT - THANK YOU)
STARLET $100.00   **which is wrong** (expected key: HBO+STARLET)

Can someone help me with what I need to change in this regular expression?

6 answers:

Answer 0 (score: 2)

Example: http://regexr.com?35dsq

Use this regex

/([-]{0,1}\$\d*\.\d\d)/g

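This finds an optional -, then a $ followed by any number of digits, then a ., then exactly 2 digits.

Then use this as the replacement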

 \1\n
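
In Java, the same replace-based idea might look like the sketch below. Java's replacement syntax refers to the captured group as $1 rather than \1. The class and method names are invented for illustration and are not part of the answer.

```java
import java.util.regex.Pattern;

class AmountLineBreaker {
    // Same pattern as above: optional '-', '$', any digits, '.', two digits.
    private static final Pattern AMOUNT = Pattern.compile("(-?\\$\\d*\\.\\d\\d)");

    // Hypothetical helper: inserts a newline after every dollar amount.
    static String breakAfterAmounts(String input) {
        // In a Java replacement string, $1 stands for capture group 1.
        return AMOUNT.matcher(input).replaceAll("$1\n");
    }

    public static void main(String[] args) {
        String in = "TOTAL DUE-STATEMENT$240.05911 Fee$10.00FRANCHISE TAX$.172VSALES TAX$.53"
                + "LOCAL-TAX$.23SERVICE DISCOUNT-$50.00PAYMENT - THANK YOU-$100.00HBO+STARLET$100.00";
        // Prints one "description + amount" entry per line.
        System.out.print(breakAfterAmounts(in));
    }
}
```

Each resulting line still holds the description and the amount together, so a second step is needed if they have to be separated.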

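Answer 1 (score: 1)

Description

This regex solution assumes the money column sometimes has a - prefix, but always consists of a $ followed by zero or more digits, a dot, and exactly 2 digits. All remaining characters are part of the name.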

([^$]*?)(-?\$\d*\.\d{2})


For each match, capture group 1 holds the name and capture group 2 holds the dollar value.

Example:

Working example: http://www.rubular.com/r/9ODCQXyFoZ

Sample text

TOTAL DUE-STATEMENT$240.05911 Fee$10.00FRANCHISE TAX$.172VSALES TAX$.53LOCAL-TAX$.23SERVICE DISCOUNT-$50.00PAYMENT - THANK YOU-$100.00HBO+STARLET$100.00

Java code

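import java.util.regex.Pattern;
import java.util.regex.Matcher;

class Module1 {
  public static void main(String[] asd) {
    // The sample text shown above
    String sourcestring = "TOTAL DUE-STATEMENT$240.05911 Fee$10.00FRANCHISE TAX$.172VSALES TAX$.53LOCAL-TAX$.23SERVICE DISCOUNT-$50.00PAYMENT - THANK YOU-$100.00HBO+STARLET$100.00";
    // Group 1: the name (everything up to the amount); group 2: the dollar amount
    Pattern re = Pattern.compile("([^$]*?)(-?\\$\\d*\\.\\d{2})", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
    Matcher m = re.matcher(sourcestring);
    int mIdx = 0;
    while (m.find()) {
      // Group 0 is the whole match, so iterate up to and including groupCount()
      for (int groupIdx = 0; groupIdx <= m.groupCount(); groupIdx++) {
        System.out.println("[" + mIdx + "][" + groupIdx + "] = " + m.group(groupIdx));
      }
      mIdx++;
    }
  }
}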

Capture groups

$matches Array:
(
    [0] => Array
        (
            [0] => TOTAL DUE-STATEMENT$240.05
            [1] => 911 Fee$10.00
            [2] => FRANCHISE TAX$.17
            [3] => 2VSALES TAX$.53
            [4] => LOCAL-TAX$.23
            [5] => SERVICE DISCOUNT-$50.00
            [6] => PAYMENT - THANK YOU-$100.00
            [7] => HBO+STARLET$100.00
        )

    [1] => Array
        (
            [0] => TOTAL DUE-STATEMENT
            [1] => 911 Fee
            [2] => FRANCHISE TAX
            [3] => 2VSALES TAX
            [4] => LOCAL-TAX
            [5] => SERVICE DISCOUNT
            [6] => PAYMENT - THANK YOU
            [7] => HBO+STARLET
        )

    [2] => Array
        (
            [0] => $240.05
            [1] => $10.00
            [2] => $.17
            [3] => $.53
            [4] => $.23
            [5] => -$50.00
            [6] => -$100.00
            [7] => $100.00
        )

)
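
To get from these capture groups to the key/value pairs the question asks for, one option is to collect them into an insertion-ordered map. This is only a sketch: the class and method names are invented here, and it assumes every description is unique (a repeated description would overwrite the earlier entry).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StatementParser {
    // Group 1: description (anything that is not '$', matched lazily);
    // group 2: amount with an optional leading '-'.
    private static final Pattern ENTRY = Pattern.compile("([^$]*?)(-?\\$\\d*\\.\\d{2})");

    // Hypothetical helper: returns description -> amount in input order.
    static Map<String, String> parse(String statement) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(statement);
        while (m.find()) {
            // trim() drops any whitespace around the description.
            pairs.put(m.group(1).trim(), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String in = "TOTAL DUE-STATEMENT$240.05911 Fee$10.00FRANCHISE TAX$.172VSALES TAX$.53"
                + "LOCAL-TAX$.23SERVICE DISCOUNT-$50.00PAYMENT - THANK YOU-$100.00HBO+STARLET$100.00";
        for (Map.Entry<String, String> e : parse(in).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

If descriptions can repeat, a list of pairs would be a safer container than a map.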

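Answer 2 (score: 0)

Given that there are always two decimal places,

your regex can be simplified to

.+?[$]\d*[.]\d{2}

You need to match the input against the regex above rather than split on it: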

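// input is the string to parse; regex is the pattern above, escaped for Java
String regex = ".+?[$]\\d*[.]\\d{2}";
Matcher m = Pattern.compile(regex).matcher(input);
while (m.find()) {
    // each match is one complete entry: description followed by its amount
    System.out.println(m.group());
}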

Answer 3 (score: 0)

When the format of your prices is known, search for the prices; everything in between is a description:

    String in = "TOTAL DUE-STATEMENT$240.05911 Fee$10.00FRANCHISE TAX$.172VSALES TAX$.53LOCAL-TAX$.23SERVICE DISCOUNT-$50.00PAYMENT - THANK YOU-$100.00HBO+STARLET$100.00";
    Pattern price = Pattern.compile("-?\\$\\d*\\.\\d{2}");
    Matcher matcher = price.matcher(in);
    int offset = 0;
    while (matcher.find(offset)) {
        String description = in.substring(offset, matcher.start());
        String value = matcher.group();
        System.out.println(description + " " + value);
        offset = matcher.end();
    }

Answer 4 (score: 0)

class Main {
    public static void main(String[] args) {
        String test = "TOTAL DUE-STATEMENT$240.05911 Fee$10.00FRANCHISE TAX$.172VSALES TAX$.53LOCAL-TAX$.23SERVICE DISCOUNT-$50.00PAYMENT - THANK YOU-$100.00HBO+STARLET$100.00";
        java.util.regex.Pattern p = java.util.regex.Pattern.compile("(?<KEY>.+?(?=-?\\$[\\d,]*\\.\\d{2}))(?<VAL>-?\\$[\\d,]*\\.\\d{2})");
        java.util.regex.Matcher m = p.matcher(test);
        while(m.find()) {
            System.out.println(m.group("KEY") + " : " + m.group("VAL"));
        }
    }
}

You just need a non-greedy match .+? for the KEY, followed by a lookahead for the VALUE, which always ends with a dot and two digits for the cents.
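
One detail of this pattern is that [\d,]* also accepts thousands separators in the amount. The sketch below runs the same regex on a made-up input (the string, class name, and method name are assumptions for illustration, not data from the question) to show an amount such as $1,240.05 being captured whole.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommaAmounts {
    // Same regex as in the answer: lazy KEY, lookahead for the amount, then VAL.
    private static final Pattern ENTRY = Pattern.compile(
            "(?<KEY>.+?(?=-?\\$[\\d,]*\\.\\d{2}))(?<VAL>-?\\$[\\d,]*\\.\\d{2})");

    // Hypothetical helper: formats every match as "KEY : VAL".
    static List<String> pairs(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            result.add(m.group("KEY") + " : " + m.group("VAL"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up input containing a comma-grouped amount.
        System.out.println(pairs("RENT$1,240.05FEE$10.00"));
        // prints [RENT : $1,240.05, FEE : $10.00]
    }
}
```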

Answer 5 (score: -1)

This should do it:

^(.+) (-?\$\d*\.\d\d)$

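The second half of the regex matches the dollar amount, including the optional - sign. The first part captures everything else, apart from the separating space.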
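
Because of the ^ and $ anchors and the required space, this pattern only fits input where each entry already sits on its own line with a space before the amount, as in the expected output listed in the question. A minimal sketch under that assumption, using Pattern.MULTILINE so the anchors apply per line (the class and method names are invented here):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineEntryParser {
    // MULTILINE makes ^ and $ match at the start and end of every line.
    private static final Pattern LINE =
            Pattern.compile("^(.+) (-?\\$\\d*\\.\\d\\d)$", Pattern.MULTILINE);

    // Hypothetical helper: returns "description=amount" for each matching line.
    static List<String> parse(String lines) {
        List<String> result = new ArrayList<>();
        Matcher m = LINE.matcher(lines);
        while (m.find()) {
            result.add(m.group(1) + "=" + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String lines = "911 Fee $10.00\nPAYMENT - THANK YOU -$100.00";
        for (String entry : parse(lines)) {
            System.out.println(entry);
        }
    }
}
```

On input where the amounts are not preceded by a space and a line break does not follow them, this pattern finds no matches, so one of the other answers is a better fit there.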