Capturing similar words in a line

Time: 2017-03-23 11:02:59

Tags: java regex pattern-matching

I have the following string.

What is (Jim)'s gift (limit)? <=> Personname <=> Amount::Spent

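In this line, I want to find and print the start and end positions of each ().

With my current code I am able to print them, but the problem is that each one is printed multiple times (I'm sure this is because of the while loop).

My code is as follows.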

    String line = "What is (Jim)'s gift (limit)? <=> Personname <=> Amount::Spent";
    if (line.contains("<=>")) {
        String[] example_split = line.split("<=>", 2);
        System.out.println("String is " + example_split[1]);
        if (example_split[0].length() > 1) {
            String[] example_entity = example_split[1].split("<=>");

            for (String splitStrings : example_entity) {
                int openParamCount = line.length() - line.replace("(", "").length();
                int closeParamCount = line.length() - line.replace(")", "").length();
                System.out.println(openParamCount + "\t" + closeParamCount);
                if (!(openParamCount == closeParamCount))
                    System.out.println("Paranthesis don't match for " + line);
                if (!(openParamCount == example_entity.length))
                    System.out.println(
                            "The entities provided and the words marked in paranthesis don't match for " + line);

                int entities_count = 0;
                int no_of_entities = example_entity.length;
                Set utterancesSet = new HashSet<>();
                int startPosition = 0;
                int endPosition = 0;
                while (entities_count < no_of_entities) {
                    List<String> matchList = new ArrayList<String>();
                    Pattern regex = Pattern.compile("\\((.*?)\\)");
                    Matcher regexMatcher = regex.matcher(line);
                    while (regexMatcher.find()) {
                        startPosition = regexMatcher.start() + 1;
                        endPosition = regexMatcher.end() - 1;

                        matchList.add(regexMatcher.group(1));
                        System.out.println("start position is " + startPosition + " end position is " + endPosition
                                + " Entity Type" + example_entity[entities_count]);
                    }
                    entities_count++;
                }
            }
        }
    }

Expected output:

String is  Personname <=> Amount::Spent
2   2
start position is 9 end position is 12 Entity Type Personname 
start position is 22 end position is 27 Entity Type Amount::Spent

Current output:

String is  Personname <=> Amount::Spent
2   2
start position is 9 end position is 12 Entity Type Personname 
start position is 22 end position is 27 Entity Type Personname 
start position is 9 end position is 12 Entity Type Amount::Spent
start position is 22 end position is 27 Entity Type Amount::Spent
2   2
start position is 9 end position is 12 Entity Type Personname 
start position is 22 end position is 27 Entity Type Personname 
start position is 9 end position is 12 Entity Type Amount::Spent
start position is 22 end position is 27 Entity Type Amount::Spent

Please let me know where I am going wrong and how I can fix this.

Thanks

1 answer:

Answer 0: (score: 1)

You need to remove the 2 loops:

  1. "for (String splitStrings : example_entity)"
  2. "while (entities_count < no_of_entities)"

    // Needs java.util.ArrayList, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern
    String line = "What is (Jim)'s gift (limit)? <=> Personname <=> Amount::Spent";
    if (line.contains("<=>")) {
        String[] example_split = line.split("<=>", 2);
        System.out.println("String is " + example_split[1]);
        if (example_split[0].length() > 1) {
            String[] example_entity = example_split[1].split("<=>");

            int openParamCount = line.length() - line.replace("(", "").length();
            int closeParamCount = line.length() - line.replace(")", "").length();
            System.out.println(openParamCount + "\t" + closeParamCount);
            if (openParamCount != closeParamCount)
                System.out.println("Paranthesis don't match for " + line);
            if (openParamCount != example_entity.length)
                System.out.println(
                        "The entities provided and the words marked in paranthesis don't match for " + line);

            int entities_count = 0;
            List<String> matchList = new ArrayList<>();
            Pattern regex = Pattern.compile("\\((.*?)\\)");
            Matcher regexMatcher = regex.matcher(line);
            // Single pass over the matches: each match is paired with the entity at the same index.
            while (regexMatcher.find() && entities_count < example_entity.length) {
                int startPosition = regexMatcher.start() + 1; // first character after "("
                int endPosition = regexMatcher.end() - 1;     // index of the closing ")"

                matchList.add(regexMatcher.group(1));
                System.out.println("start position is " + startPosition + " end position is " + endPosition
                        + " Entity Type" + example_entity[entities_count]);
                // Advance inside the loop so the next match gets the next entity.
                entities_count++;
            }
        }
    }
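
For reference, the single-pass idea (one loop over the regex matches, moving to the next entity on every match) can be packaged as a self-contained class. This is only a minimal sketch: the class name EntitySpans and the method describe are made up for illustration, and it trims the entity names, which the code above does not do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntitySpans {
    // Matches the shortest run of characters between "(" and ")".
    private static final Pattern PAREN = Pattern.compile("\\((.*?)\\)");

    /** Returns one description per parenthesised word, paired with the entity at the same index. */
    static List<String> describe(String line) {
        List<String> result = new ArrayList<>();
        String[] parts = line.split("<=>", 2);
        if (parts.length < 2) {
            return result; // no entity list after the utterance
        }
        String[] entities = parts[1].split("<=>");
        // parts[0] is a prefix of line, so match indices are also valid for line.
        Matcher m = PAREN.matcher(parts[0]);
        int i = 0;
        while (m.find() && i < entities.length) {
            int start = m.start() + 1; // first character after "("
            int end = m.end() - 1;     // index of the closing ")"
            result.add("start position is " + start + " end position is " + end
                    + " Entity Type " + entities[i].trim());
            i++;
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "What is (Jim)'s gift (limit)? <=> Personname <=> Amount::Spent";
        describe(line).forEach(System.out::println);
    }
}
```

Returning the lines instead of printing them inside the loop makes it easier to check that each parenthesised word is reported exactly once.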
    

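Although your code assumes that the parentheses will always be closed, it leaves no room for nested (inner) parentheses, e.g.

What is ((Jim) and (Kyle)'s gift (limit)?

will not return the correct result. But this is only a problem if you expect input in that form.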
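
If nested input does have to be supported, the non-greedy pattern is not enough, because it pairs each "(" with the nearest ")". One possible alternative, which is not part of the original answer, is a stack-based scan. The sketch below is an assumption-laden illustration: the class name NestedParens, the method pairs, and the balanced sample sentence are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class NestedParens {
    /**
     * Returns {openIndex, closeIndex} pairs for every balanced pair of parentheses,
     * in the order the closing parentheses appear. Throws if the input is unbalanced.
     */
    static List<int[]> pairs(String text) {
        Deque<Integer> open = new ArrayDeque<>(); // indices of "(" not yet closed
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '(') {
                open.push(i);
            } else if (c == ')') {
                if (open.isEmpty()) {
                    throw new IllegalArgumentException("Unmatched ')' at index " + i);
                }
                // The most recently opened "(" belongs to this ")".
                result.add(new int[] {open.pop(), i});
            }
        }
        if (!open.isEmpty()) {
            throw new IllegalArgumentException("Unmatched '(' at index " + open.peek());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical nested input, chosen only to illustrate the scan.
        for (int[] p : pairs("What is ((Jim) and (Kyle))'s gift (limit)?")) {
            System.out.println("( at " + p[0] + ", ) at " + p[1]);
        }
    }
}
```

Because the scan throws on unbalanced input, it also replaces the separate open/close count check.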