
Question

I have the following string and want to extract the rule's content, i.e. "My Rule Description Looks Like This":

rule "My Rule Description Looks Like This"      
        followed by some white space other characters such as quotes".

When I use the following, I get a java.lang.StringIndexOutOfBoundsException: String index out of range: -2:

String ruleName = rule.substring(rule.indexOf("rule \"") + 7, rule.indexOf("\""));

When I use lastIndexOf:

String ruleName = rule.substring(rule.indexOf("rule \"") + 7, rule.lastIndexOf("\""));

the code runs fine, but the output looks like this:

My Rule Description Looks Like This"        
        followed by some white space other characters and quotes

Any ideas why the first option, using indexOf, throws the exception?

Answer 1

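For any kind of complex text extraction, you may want to consider a regular expression. Here is a short script that extracts the rule; it avoids fiddly string manipulation, which, as you have seen, can be error-prone.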

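import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = "rule \"My Rule Description Looks Like This\"\n";
line += "followed by some white space other characters such as quotes\".";
// Lazily capture everything between the first pair of quotes after "rule".
String pattern = "rule\\s+\"(.*?)\".*";

// DOTALL lets the trailing .* run across the line break.
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(line);
if (m.find()) {
    System.out.println("Found a rule: " + m.group(1));
} else {
    System.out.println("Could not find a rule.");
}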

Output:

Found a rule: My Rule Description Looks Like This

Demo here:

Rextester
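If the input may contain more than one rule, the same idea can be applied in a loop. The following is a minimal sketch, not part of the original answer: the class name RuleNameFinder and the method findRuleNames are hypothetical, and the pattern swaps the lazy (.*?) for the negated character class [^"]*, which assumes a rule name never contains a double quote.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class RuleNameFinder {
    // "rule", whitespace, then everything up to the next double quote.
    // [^"]* cannot cross a quote, so neither DOTALL nor a lazy quantifier is needed.
    private static final Pattern RULE = Pattern.compile("rule\\s+\"([^\"]*)\"");

    static List<String> findRuleNames(String text) {
        List<String> names = new ArrayList<>();
        Matcher m = RULE.matcher(text);
        while (m.find()) {          // each find() resumes after the previous match
            names.add(m.group(1));  // group 1 is the text between the quotes
        }
        return names;
    }

    public static void main(String[] args) {
        String text = "rule \"First Rule\"\n    when something\n"
                    + "rule \"Second Rule\"\n    then \"quoted\" text\n";
        System.out.println(findRuleNames(text)); // prints [First Rule, Second Rule]
    }
}
```

Compiling the pattern once into a static field avoids recompiling it on every call.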

Answer 2

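From the documentation:

public String substring(int beginIndex, int endIndex)

Throws IndexOutOfBoundsException if the beginIndex is negative, or endIndex is larger than the length of this String object, or beginIndex is larger than endIndex.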

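You are calling rule.substring(rule.indexOf("rule \"") + 7, rule.indexOf("\"")). The first argument gives you the index of the first occurrence of rule followed by a quote, call it x, plus 7. The second argument gives you the index of the first quote, which is x + 5 (x plus the five characters of "rule " that precede the quote). So you are calling substring(x + 7, x + 5), which falls under the exception case: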

The first argument is larger than the second.

In your second case, with lastIndexOf, you get the last quote in the string, which comes after the start index, so you don't hit this problem.
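One way to stay with indexOf is to begin the search for the closing quote after the opening one, using the two-argument overload String.indexOf(int ch, int fromIndex). This is a minimal sketch and not code from the answers: the class RuleSubstring and the method extractRuleName are hypothetical names, and the offset comes from prefix.length() (which is 6 for rule ") rather than the hard-coded 7 in the question.

```java
// Hypothetical helper for illustration only.
class RuleSubstring {
    // Returns the text between `rule "` and the next double quote,
    // or null when either delimiter is missing.
    static String extractRuleName(String rule) {
        String prefix = "rule \"";
        int prefixAt = rule.indexOf(prefix);
        if (prefixAt < 0) {
            return null;                      // no `rule "` in the input
        }
        int start = prefixAt + prefix.length();
        int end = rule.indexOf('"', start);   // search only after the opening quote
        if (end < 0) {
            return null;                      // opening quote is never closed
        }
        return rule.substring(start, end);    // start <= end is guaranteed here
    }

    public static void main(String[] args) {
        String rule = "rule \"My Rule Description Looks Like This\"      \n"
                    + "        followed by some white space other characters such as quotes\".";
        System.out.println(extractRuleName(rule)); // prints My Rule Description Looks Like This
    }
}
```

Because the second search starts at start, end can never be smaller than start, so substring cannot throw for this reason.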

Answer 3

indexOf returns the index of the first occurrence of the specified string.

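So your first example tries to take a substring starting at index 7 (0 is the index where the String is found, and then 7 is added) and ending at index 5 (where the first " is found).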

The substring(int beginIndex, int endIndex) method contains some logic so that if the end index minus the begin index is < 0, it throws a StringIndexOutOfBoundsException with that value:

int subLen = endIndex - beginIndex;
if (subLen < 0) {
    throw new StringIndexOutOfBoundsException(subLen);
}

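Your second example doesn't throw an exception, but because you are using lastIndexOf() it takes the substring from index 7 up to the last " in the String, which sits near the end.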
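The index arithmetic described above can be checked directly. This is a small sketch under stated assumptions: the class SubstringIndexDemo and its methods are hypothetical, and the input is assumed to start with rule " as in the question. It deliberately does not inspect the exception message, since that text may vary between Java versions.

```java
// Hypothetical demo class for illustration only.
class SubstringIndexDemo {
    // Returns {begin, firstQuote, lastQuote} exactly as the question computes them.
    static int[] indices(String rule) {
        int begin = rule.indexOf("rule \"") + 7;
        int firstQuote = rule.indexOf("\"");
        int lastQuote = rule.lastIndexOf("\"");
        return new int[] {begin, firstQuote, lastQuote};
    }

    // True when substring(begin, end) rejects the given bounds.
    static boolean throwsOnSubstring(String s, int begin, int end) {
        try {
            s.substring(begin, end);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String rule = "rule \"My Rule Description Looks Like This\"      \n"
                    + "        followed by some white space other characters such as quotes\".";
        int[] idx = indices(rule);
        System.out.println("begin = " + idx[0]);       // 0 + 7
        System.out.println("firstQuote = " + idx[1]);  // the opening quote, before begin
        System.out.println("lastQuote = " + idx[2]);   // the quote near the end of the text
        System.out.println("indexOf bounds throw: " + throwsOnSubstring(rule, idx[0], idx[1]));
        System.out.println("lastIndexOf bounds throw: " + throwsOnSubstring(rule, idx[0], idx[2]));
    }
}
```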

The best solution is to use a regular expression pattern, as shown in @Tim Biegeleisen's answer.

Using IndexOf to get a substring of a string with whitespace and quotes

3 answers:
