How do I return the number of distinct, whitespace-separated occurrences of a string within a string?

Time: 2016-01-30 18:02:00

Tags: java regex string

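Given a String X in which a character sequence repeats in ways that are unrelated to one another, this method returns the number of distinct times that sequence occurs in X. E.g. String y = "the green thermometer in the room": "the" repeats in three places (THE green THErmometer in THE room), but only the first and the last are delimited by whitespace. The method ignores the second occurrence of THE and returns 2, using a brute-force approach.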

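The code has a bug that I cannot fix, as I am still learning Java: if the string argument starts with a space, e.g. " example" instead of "example", it returns a result that I cannot even explain. A simplified, efficient method would be much appreciated.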

    class Test{
    public Test(){
    }

    private int numberOfDistinctOccurence(String string, String token) {
        int tokLength = token.length();
        boolean lastEndsWithSpace, previousIsSpace, nextIsSpace, isFirstSentence;
        boolean isEqual = lastEndsWithSpace = previousIsSpace = nextIsSpace = isFirstSentence = false;
        int count = 0;
        for (int shift = 0, stopCount = 0; stopCount < string.length() - token.length(); 
                                                    stopCount++, shift++, tokLength++) {
            isEqual = (string.substring(shift, tokLength).equalsIgnoreCase(token));
            lastEndsWithSpace = (string.substring(string.length()).equals(" ") || 
                                    (string.substring(string.length()).equals("")));
            if (shift == 1) {
                previousIsSpace = (string.substring(shift - 1, shift).equals(" "));
            }
            nextIsSpace = (string.substring(tokLength, tokLength + 1).equals(" "));
            isFirstSentence = (shift == 0 && string.substring(0, 0).equals("") || nextIsSpace);
            if (isEqual && isFirstSentence) {
                count++;
            } else if (isEqual && nextIsSpace || lastEndsWithSpace && previousIsSpace) {
                count++;
            }
        }
        int x = string.lastIndexOf(token.substring(token.length())); // index of last tokens char
        if (string.substring(x - token.length(), x).equalsIgnoreCase(token)) {
            if (string.length() == token.length() && string.equalsIgnoreCase(token)) {
            } else {
                count = (string.substring(x - token.length() - 1, x - token.length()).
                                            equalsIgnoreCase(" ")) ? count + 1 : count;
            }
        }
        return count = string.length() == token.length() && string.equalsIgnoreCase(token) ? 1 : count;

    }
    public static void main(String[] args) {
        Test test = new Test();
        System.out.println(test.numberOfDistinctOccurence("The green Thermometer in the house", "he"));
    }
}
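A possible explanation for the odd result, offered as a reading of the code above rather than a verified diagnosis: `string.substring(string.length())` always yields the empty string, so `lastEndsWithSpace` is always true, and `previousIsSpace` is only assigned when `shift == 1` and never reset. With a leading space both flags stay true, so the `else if` branch appears to increment `count` on every later iteration. The small sketch below only checks the underlying `String` facts; the class and method names are made up for illustration.

```java
// Hypothetical helper class: checks the String facts the loop above relies on.
final class SubstringFacts {
    // substring(length) is always the empty string, never the last character.
    static boolean tailIsEmpty(String s) {
        return s.substring(s.length()).isEmpty();
    }

    // substring(0, 0) is always the empty string as well.
    static boolean headIsEmpty(String s) {
        return s.substring(0, 0).isEmpty();
    }

    // True when the first character is a space (assumes a non-empty input).
    static boolean startsWithSpace(String s) {
        return s.substring(0, 1).equals(" ");
    }

    public static void main(String[] args) {
        String indented = " example";
        System.out.println(tailIsEmpty(indented));      // the "lastEndsWithSpace" test
        System.out.println(headIsEmpty(indented));      // the "isFirstSentence" test
        System.out.println(startsWithSpace(indented));  // the "previousIsSpace" test at shift == 1
    }
}
```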

2 answers:

Answer 0 (score: 0)

To get rid of your whitespace problem, add this line at the top of the method body:

Foo

and replace every use of string with trimmed. That's it.
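The line this answer refers to is presumably something like `String trimmed = string.trim();` (an assumption; the exact line is not shown here). As a minimal sketch of the idea, the method below trims the input first and then, instead of the original index arithmetic, swaps in a plain split on whitespace. The class and method names are hypothetical.

```java
// Hypothetical sketch: trim first, then count whitespace-separated words
// that equal the token, ignoring case as the question's code does.
final class TrimSketch {
    static int countWord(String string, String token) {
        String trimmed = string.trim();          // drop leading/trailing whitespace
        if (trimmed.isEmpty()) {
            return 0;                            // nothing to count in a blank input
        }
        int count = 0;
        for (String word : trimmed.split("\\s+")) {
            if (word.equalsIgnoreCase(token)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWord(" the green thermometer in the room", "the"));
        System.out.println(countWord("The green Thermometer in the house", "he"));
    }
}
```

Splitting avoids the boundary checks entirely, at the cost of allocating an array of words; for short sentences like the ones in the question that is unlikely to matter.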

Answer 1 (score: 0)

Just use \\b to match whole words:

static int numberOfDistinctOccurrence(String source, String token) {
    Pattern pattern = Pattern.compile("\\s*\\b" + token + "\\b\\s*");
    Matcher matcher = pattern.matcher(source);
    int c = 0;
    while (matcher.find()) {
        c++;
    }
    return c;
}

Edit

This version also works for tokens that contain non-\\w characters:

static int numberOfDistinctOccurrence(String source, String token) {
    Pattern pattern = Pattern.compile("(^|\\s*\\b|\\s+)" + Pattern.quote(token) + "(\\b\\s*|\\s+|$)");
    Matcher matcher = pattern.matcher(source);
    int c = 0;
    while (matcher.find()) {
        c++;
    }
    return c;
}
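The `Pattern.quote` call is what lets a token such as "the+" be treated as literal text rather than as a regular expression. A small sketch of the difference, using hypothetical helper names:

```java
import java.util.regex.Pattern;

// Hypothetical demo: the same token used as a raw regex and as quoted literal text.
final class QuoteDemo {
    // Treats the token as a regular expression, metacharacters included.
    static boolean matchesRaw(String token, String input) {
        return Pattern.matches(token, input);
    }

    // Treats the token as literal text.
    static boolean matchesQuoted(String token, String input) {
        return Pattern.matches(Pattern.quote(token), input);
    }

    public static void main(String[] args) {
        System.out.println(matchesRaw("the+", "theee"));    // true: '+' means "one or more e"
        System.out.println(matchesQuoted("the+", "theee")); // false: '+' must appear literally
        System.out.println(matchesQuoted("the+", "the+"));  // true
    }
}
```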

For example:

System.out.println(numberOfDistinctOccurrence("the green thermometer in the room", "the")); // 2
System.out.println(numberOfDistinctOccurrence("the green thermometer in the+ room", "the")); // 2
System.out.println(numberOfDistinctOccurrence("the green thermometer in the+ room", "the+")); // 1
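Note that in the second example "the" is also counted inside "the+", because \\b sees a word boundary between "e" and "+". If only whitespace (or the start or end of the string) should delimit a match, as the question describes, one alternative is to swap the \\b checks for lookarounds. The sketch below is a variation, not the answer's method; the class name is hypothetical, the token is assumed to be non-empty, and the `CASE_INSENSITIVE` flag is added only to mirror the `equalsIgnoreCase` calls in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variation: count occurrences delimited only by whitespace
// or by the start/end of the input.
final class WhitespaceDelimitedCount {
    static int count(String source, String token) {
        Pattern pattern = Pattern.compile(
                "(?<!\\S)" + Pattern.quote(token) + "(?!\\S)", // no non-space char on either side
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(source);
        int c = 0;
        while (matcher.find()) {
            c++;
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.println(count("the green thermometer in the room", "the"));
        System.out.println(count("the green thermometer in the+ room", "the"));
        System.out.println(count("the green thermometer in the+ room", "the+"));
    }
}
```

Because the lookarounds are zero-width, a single space between two adjacent occurrences can serve as the delimiter for both.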

Edit (thanks to Wiktor Stribiżew's comment)

The regex should be changed to:

"(\\b|[^\\w])" + Pattern.quote(token) + "(\\b|[^\\w])"