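Given a String X in which a token is repeated in several unrelated ways, this method returns the number of distinct times the token occurs in X as a separate word. E.g. for String y = "the green thermometer in the room", the token "the" appears in three places (THE green THErmometer in THE room), but only the first and the last are delimited by whitespace. The method ignores the second occurrence of "the" and returns 2, using a brute-force approach.
The code has a bug that I cannot fix because I am still learning Java: if the first letter of the string argument is indented with a space, e.g. " example" instead of "example", it gives a result whose origin I cannot even explain. A simplified, working approach would be much appreciated.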
class Test {
    public Test() {
    }

    private int numberOfDistinctOccurence(String string, String token) {
        int tokLength = token.length();
        boolean lastEndsWithSpace, previousIsSpace, nextIsSpace, isFirstSentence;
        boolean isEqual = lastEndsWithSpace = previousIsSpace = nextIsSpace = isFirstSentence = false;
        int count = 0;
        for (int shift = 0, stopCount = 0; stopCount < string.length() - token.length();
                stopCount++, shift++, tokLength++) {
            isEqual = (string.substring(shift, tokLength).equalsIgnoreCase(token));
            lastEndsWithSpace = (string.substring(string.length()).equals(" ") ||
                    (string.substring(string.length()).equals("")));
            if (shift == 1) {
                previousIsSpace = (string.substring(shift - 1, shift).equals(" "));
            }
            nextIsSpace = (string.substring(tokLength, tokLength + 1).equals(" "));
            isFirstSentence = (shift == 0 && string.substring(0, 0).equals("") || nextIsSpace);
            if (isEqual && isFirstSentence) {
                count++;
            } else if (isEqual && nextIsSpace || lastEndsWithSpace && previousIsSpace) {
                count++;
            }
        }
        int x = string.lastIndexOf(token.substring(token.length())); // index of last tokens char
        if (string.substring(x - token.length(), x).equalsIgnoreCase(token)) {
            if (string.length() == token.length() && string.equalsIgnoreCase(token)) {
            } else {
                count = (string.substring(x - token.length() - 1, x - token.length()).
                        equalsIgnoreCase(" ")) ? count + 1 : count;
            }
        }
        return count = string.length() == token.length() && string.equalsIgnoreCase(token) ? 1 : count;
    }

    public static void main(String[] args) {
        Test test = new Test();
        System.out.println(test.numberOfDistinctOccurence("The green Thermometer in the house", "he"));
    }
}
Answer 0 (score: 0)
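To get rid of your whitespace problem, add this line at the top of the method body:
String trimmed = string.trim();
and replace every use of string with trimmed. That's it.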
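Building on the trimming idea, the whole count can also be done without any index arithmetic. The following is only a minimal sketch, assuming that "distinct occurrence" means a whitespace-delimited word and that case should be ignored (as the equalsIgnoreCase calls in the question suggest). The class name WhitespaceTokenCounter and the method name countWholeWords are made up for illustration.

```java
class WhitespaceTokenCounter {
    // Counts the whitespace-delimited words in text that equal token, ignoring case.
    static int countWholeWords(String text, String token) {
        // trim() first: otherwise split("\\s+") yields a leading empty string
        // for input that starts with whitespace.
        String trimmed = text.trim();
        if (trimmed.isEmpty() || token.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (String word : trimmed.split("\\s+")) {
            if (word.equalsIgnoreCase(token)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWholeWords("The green Thermometer in the house", "the"));   // 2
        System.out.println(countWholeWords("  The green Thermometer in the house", "the")); // 2
        System.out.println(countWholeWords("The green Thermometer in the house", "he"));    // 0
    }
}
```

Because the comparison is made word by word, "Thermometer" can never be counted for the token "the", and a leading space in the input no longer changes the result.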
Answer 1 (score: 0)
Just use \b (written "\\b" in a Java string literal) to match whole words:
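import java.util.regex.Matcher;
import java.util.regex.Pattern;

static int numberOfDistinctOccurrence(String source, String token) {
    // token is inserted into the regex unescaped here
    Pattern pattern = Pattern.compile("\\s*\\b" + token + "\\b\\s*");
    Matcher matcher = pattern.matcher(source);
    int c = 0;
    // each successful find() is one whole-word occurrence
    while (matcher.find()) {
        c++;
    }
    return c;
}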
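One caveat with splicing the token straight into the pattern: any regex metacharacter in it stays active, and \b only matches next to a word character. The sketch below is an illustration, not part of the original answer. The class QuoteDemo and its helper methods are hypothetical names, and the inputs are chosen only to show the effect.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Counts non-overlapping matches of regex in source.
    static int countMatches(String regex, String source) {
        Matcher matcher = Pattern.compile(regex).matcher(source);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Token spliced in as-is: regex metacharacters stay active.
    static int countUnquoted(String source, String token) {
        return countMatches("\\s*\\b" + token + "\\b\\s*", source);
    }

    // Token escaped with Pattern.quote, boundaries unchanged.
    static int countQuoted(String source, String token) {
        return countMatches("\\s*\\b" + Pattern.quote(token) + "\\b\\s*", source);
    }

    public static void main(String[] args) {
        String source = "the green thermometer in the+ room";
        System.out.println(countUnquoted(source, "the"));  // 2
        System.out.println(countUnquoted(source, "the+")); // 2: "e+" is read as "one or more e"
        System.out.println(countQuoted(source, "the+"));   // 0: no \b between '+' and ' '
    }
}
```

So escaping the token alone is not enough when it ends in a non-word character; the boundary part of the pattern has to change as well.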
Edit:
This version also works for tokens containing non-\w characters:
static int numberOfDistinctOccurrence(String source, String token) {
    Pattern pattern = Pattern.compile("(^|\\s*\\b|\\s+)" + Pattern.quote(token) + "(\\b\\s*|\\s+|$)");
    Matcher matcher = pattern.matcher(source);
    int c = 0;
    while (matcher.find()) {
        c++;
    }
    return c;
}
For example:
System.out.println(numberOfDistinctOccurrence("the green thermometer in the room", "the")); // 2
System.out.println(numberOfDistinctOccurrence("the green thermometer in the+ room", "the")); // 2
System.out.println(numberOfDistinctOccurrence("the green thermometer in the+ room", "the+")); // 1
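The question compares with equalsIgnoreCase, whereas the patterns above are case-sensitive, so "The" would not be counted for the token "the". A minimal sketch, assuming case-insensitive matching is what is wanted, reuses the same pattern and only adds the Pattern.CASE_INSENSITIVE flag. The class name IgnoreCaseCounter and the method name count are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IgnoreCaseCounter {
    // Same pattern as above, compiled with CASE_INSENSITIVE (an assumption
    // taken from the equalsIgnoreCase calls in the question).
    static int count(String source, String token) {
        Pattern pattern = Pattern.compile(
                "(^|\\s*\\b|\\s+)" + Pattern.quote(token) + "(\\b\\s*|\\s+|$)",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(source);
        int c = 0;
        while (matcher.find()) {
            c++;
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.println(count("The green Thermometer in the house", "the"));  // 2
        System.out.println(count(" The green Thermometer in the house", "the")); // 2
        System.out.println(count("The green Thermometer in the house", "he"));   // 0
    }
}
```

The second call uses the input with a leading space that caused trouble in the question, and the count stays the same.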
Edit (following Wiktor Stribiżew's comment):
The regular expression should be changed to:
"(\\b|[^\\w])" + Pattern.quote(token) + "(\\b|[^\\w])"