I would really appreciate some help with Java code to split the following inputs:
word1 key="value with space" word3 -> [ "word1", "key=\"value with space\"", "word3" ]
word1 "word2 with space" word3 -> [ "word1", "word2 with space", "word3" ]
word1 word2 word3 -> [ "word1" , "word2", "word3" ]
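The first sample input is the hard one: the quotes in the second word sit in the middle of the token rather than at its start. I found several ways to handle the middle example, as described in "Split string on spaces in Java, except if between quotes (i.e. treat "hello world" as one token)".
Answer 0 (score: 1)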
Instead of using a regular expression, you can simply iterate over the string:
import java.util.ArrayList;
import java.util.List;

public static String[] splitWords(String str) {
    List<String> array = new ArrayList<>();
    boolean inQuote = false;  // Marker telling us if we are between quotes
    int previousStart = -1;   // The index of the beginning of the last word
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if (Character.isWhitespace(c)) {
            if (previousStart != -1 && !inQuote) {
                // end of word
                array.add(str.substring(previousStart, i));
                previousStart = -1;
            }
        } else {
            // possibly new word
            if (previousStart == -1) previousStart = i;
            // toggle state of quote
            if (c == '"')
                inQuote = !inQuote;
        }
    }
    // Add last segment if there is one
    if (previousStart != -1)
        array.add(str.substring(previousStart));
    return array.toArray(new String[array.size()]);
}
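The advantage of this approach is that it correctly handles quotes that appear in the middle of a word, not only quotes next to whitespace, as required. For example, the following is a single token: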
a"b c"d"e f"g
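To check that claim, here is a minimal, self-contained sketch. The class name `QuoteAwareSplitDemo` and the sample inputs are assumptions made for illustration; the method restates the loop above without changing its behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class QuoteAwareSplitDemo {

    // Whitespace ends a token only when we are outside a pair of quotes.
    static String[] splitWords(String str) {
        List<String> tokens = new ArrayList<>();
        boolean inQuote = false;
        int start = -1; // index where the current token began, or -1 if none
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (Character.isWhitespace(c)) {
                if (start != -1 && !inQuote) {
                    tokens.add(str.substring(start, i));
                    start = -1;
                }
            } else {
                if (start == -1) start = i;
                if (c == '"') inQuote = !inQuote;
            }
        }
        // Flush the trailing token, if any (also covers an unclosed quote).
        if (start != -1) tokens.add(str.substring(start));
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] inputs = {
            "word1 key=\"value with space\" word3",
            "a\"b c\"d\"e f\"g",
            "x \"unclosed quote here"
        };
        for (String input : inputs) {
            System.out.println(Arrays.toString(splitWords(input)));
        }
    }
}
```

The second input comes back as one token. Note that with an unclosed quote, as in the third input, everything from the quote to the end of the string ends up in the last token; whether that is acceptable depends on your input.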
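Answer 1 (score: 0)
This can be done with a mix of regular expressions and replacement. First find the text enclosed in quotes and replace the spaces inside it with a non-space placeholder. Then split the string on spaces and swap the placeholder back to a space in each piece.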
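// Needs java.util.ArrayList, java.util.List, java.util.regex.Matcher and java.util.regex.Pattern.
String s1 = "word1 key=\"value with space\" word3";
List<String> list = new ArrayList<String>();
Matcher m = Pattern.compile("\"([^\"]*)\"").matcher(s1);
while (m.find()) {
    // Replaces the spaces between quotes with ||.
    // Caveat: String.replace also rewrites any identical text outside the quotes,
    // and the input must not already contain "||".
    s1 = s1.replace(m.group(1), m.group(1).replace(" ", "||"));
}
for (String s : s1.split(" ")) {
    list.add(s.replace("||", " "));           // switch the placeholder back to a space
    System.out.println(s.replace("||", " ")); // just to see the output
}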
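One weakness of the snippet above is that `String.replace` substitutes every occurrence of the quoted text, so identical text outside the quotes is altered too (for example `a b "a b"`). A possible variant, sketched here under assumptions, masks spaces only inside each match by using `Matcher.appendReplacement`. The class name `PlaceholderSplitDemo` is hypothetical, and the sketch assumes the input never contains the character U+0000, which it uses as the placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderSplitDemo {

    // Assumption: U+0000 never occurs in the input, so it can stand in for spaces.
    private static final char PLACEHOLDER = '\u0000';
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    static List<String> split(String input) {
        Matcher m = QUOTED.matcher(input);
        StringBuffer masked = new StringBuffer();
        while (m.find()) {
            // Mask spaces inside this quoted match only; text outside is untouched.
            String protectedText = m.group().replace(' ', PLACEHOLDER);
            m.appendReplacement(masked, Matcher.quoteReplacement(protectedText));
        }
        m.appendTail(masked);

        List<String> tokens = new ArrayList<>();
        for (String piece : masked.toString().split(" +")) {
            if (!piece.isEmpty()) {
                // Restore the original spaces inside the quoted part.
                tokens.add(piece.replace(PLACEHOLDER, ' '));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("word1 key=\"value with space\" word3"));
        System.out.println(split("a b \"a b\""));
    }
}
```

`Matcher.quoteReplacement` is there so that any `$` or `\` inside the quoted text is copied literally instead of being read as a group reference.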
Answer 2 (score: 0)
The split can be done with a lookahead in the regular expression:
String[] words = input.split(" +(?=(([^\"]*\"){2})*[^\"]*$)");
Here is some test code:
String[] inputs = {
    "word1 key=\"value with space\" word3",
    "word1 \"word2 with space\" word3",
    "word1 word2 word3"
};
for (String input : inputs) {
    String[] words = input.split(" +(?=(([^\"]*\"){2})*[^\"]*$)");
    System.out.println(Arrays.toString(words));
}
Output:
[word1, key="value with space", word3]
[word1, "word2 with space", word3]
[word1, word2, word3]
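The lookahead in that pattern is easier to follow when it is written out piece by piece. The sketch below builds the same regular expression from commented fragments; the class name `LookaheadSplitDemo` and the constant `SEPARATOR` are hypothetical names used only for illustration. A run of spaces counts as a separator only if an even number of quotes remains between it and the end of the input, which means the spaces are not inside a quoted section.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LookaheadSplitDemo {

    // Same pattern as in the answer, split into fragments for readability.
    static final Pattern SEPARATOR = Pattern.compile(
          " +"                   // the separator itself: one or more spaces
        + "(?="                  // lookahead: must hold after the spaces, consumes nothing
        + "(([^\"]*\"){2})*"     //   zero or more PAIRS of quotes ...
        + "[^\"]*$"              //   ... then no further quote up to the end of input
        + ")");

    static String[] split(String input) {
        return SEPARATOR.split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("word1 key=\"value with space\" word3")));
        // Unbalanced quote: only spaces followed by an even number of quotes split.
        System.out.println(Arrays.toString(split("a \"b c")));
    }
}
```

Two things to keep in mind with this approach: with an unbalanced quote the result may not be what you expect (the second call yields `a "b` and `c`), and the lookahead rescans the rest of the string for every run of spaces, so it can become slow on very long inputs.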