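I want to split a string at commas ",". The string contains escaped commas "\," and escaped backslashes "\\". Commas at the beginning and at the end, as well as several commas in a row, should result in empty strings.
So ",,\,\\,," should become "", "", "\,\\", "", "".
Note that my example strings show the backslash as a single "\". Java string literals would have them doubled.
I tried several packages without success. My last idea would be to write my own parser.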
Answer 0 (score: 0)
While a dedicated library is certainly a good idea, the following will work:
public static String[] splitValues(final String input) {
    final ArrayList<String> result = new ArrayList<String>();
    // (?:\\\\)* matches any number of \-pairs
    // (?<!\\) ensures that the \-pairs aren't preceded by a single \
    final Pattern pattern = Pattern.compile("(?<!\\\\)(?:\\\\\\\\)*,");
    final Matcher matcher = pattern.matcher(input);
    int previous = 0;
    while (matcher.find()) {
        result.add(input.substring(previous, matcher.end() - 1));
        previous = matcher.end();
    }
    result.add(input.substring(previous, input.length()));
    return result.toArray(new String[result.size()]);
}
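The idea is to find a "," that is preceded by no "\" or by an even number of "\" (that is, an unescaped ","). Since the "," is the last character of the pattern, the cut is made at end() - 1, just before the ",".
The function is tested against most cases I can think of, except for null input. If you prefer to work with a List<String>, you can of course change the return type; I just adopted the pattern implemented in split() to handle escapes.
An example class using this function: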
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Print {
    public static void main(final String[] args) {
        String input = ",,\\,\\\\,,";
        final String[] strings = splitValues(input);
        System.out.print("\""+input+"\" => ");
        printQuoted(strings);
    }

    public static String[] splitValues(final String input) {
        final ArrayList<String> result = new ArrayList<String>();
        // (?:\\\\)* matches any number of \-pairs
        // (?<!\\) ensures that the \-pairs aren't preceded by a single \
        final Pattern pattern = Pattern.compile("(?<!\\\\)(?:\\\\\\\\)*,");
        final Matcher matcher = pattern.matcher(input);
        int previous = 0;
        while (matcher.find()) {
            result.add(input.substring(previous, matcher.end() - 1));
            previous = matcher.end();
        }
        result.add(input.substring(previous, input.length()));
        return result.toArray(new String[result.size()]);
    }

    public static void printQuoted(final String[] strings) {
        if (strings.length > 0) {
            System.out.print("[\"");
            System.out.print(strings[0]);
            for (int i = 1; i < strings.length; i++) {
                System.out.print("\", \"");
                System.out.print(strings[i]);
            }
            System.out.println("\"]");
        } else {
            System.out.println("[]");
        }
    }
}
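The List<String> return type mentioned above could look roughly like the following. This is only a sketch: the class name SplitValuesList and the method name splitValuesToList are made up for illustration, and mapping a null input to an empty list is an assumption, not something the answer specifies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitValuesList {
    // Same pattern as above: a comma preceded by zero or an even number of backslashes.
    private static final Pattern UNESCAPED_COMMA =
            Pattern.compile("(?<!\\\\)(?:\\\\\\\\)*,");

    // Hypothetical variant: returns a List and maps null to an empty list (an assumption).
    public static List<String> splitValuesToList(final String input) {
        if (input == null) {
            return Collections.emptyList();
        }
        final List<String> result = new ArrayList<>();
        final Matcher matcher = UNESCAPED_COMMA.matcher(input);
        int previous = 0;
        while (matcher.find()) {
            // The comma is the last character of the match, so cut just before it.
            result.add(input.substring(previous, matcher.end() - 1));
            previous = matcher.end();
        }
        // Everything after the last unescaped comma, possibly the empty string.
        result.add(input.substring(previous));
        return result;
    }

    public static void main(final String[] args) {
        System.out.println(splitValuesToList(",,\\,\\\\,,").size());
    }
}
```

Compiling the pattern once in a static field avoids recompiling it on every call; the splitting logic itself is unchanged.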
Answer 1 (score: 0)
In this case, a custom function sounds better to me. Try this:
public String[] splitEscapedString(String s) {
    // Separator character that won't appear in the string.
    // If you are reading lines, '\n' should work fine since it will never appear.
    String c = "\n";
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < s.length(); ++i) {
        if (s.charAt(i) == '\\' && i + 1 < s.length()) {
            // Escaped character: drop the backslash and keep the character itself.
            // Note that this unescapes the values ("\," becomes ",").
            sb.append(s.charAt(++i));
        } else if (s.charAt(i) == ',') {
            // Unescaped comma: mark the split position.
            sb.append(c);
        } else {
            // Ordinary character (or a dangling backslash at the very end of the input).
            sb.append(s.charAt(i));
        }
    }
    // The limit -1 keeps trailing empty strings, which split(c) alone would drop.
    return sb.toString().split(c, -1);
}
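If no character can be guaranteed to be absent from the input, the same loop can collect the fields in a list directly instead of joining them with a separator character and splitting again. A minimal sketch of that idea follows; the names EscapedSplitter and splitAndUnescape are hypothetical, and like the function above it removes the escaping backslashes from the values.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapedSplitter {
    // Hypothetical helper: splits at unescaped commas and removes the escaping backslashes.
    public static List<String> splitAndUnescape(String s) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == '\\' && i + 1 < s.length()) {
                // Escaped character: skip the backslash, keep the next character.
                current.append(s.charAt(++i));
            } else if (ch == ',') {
                // Unescaped comma: close the current field, even if it is empty.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                // Ordinary character (or a dangling backslash at the very end).
                current.append(ch);
            }
        }
        // The last field is always added, so a trailing comma yields an empty string.
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(splitAndUnescape(",,\\,\\\\,,").size());
    }
}
```

Because every comma closes a field and the final field is always added, leading, trailing and consecutive commas all produce empty strings.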
Answer 2 (score: 0)
Don't use .split(); instead, find all the matches between the (unescaped) commas:
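// subjectString is the input to split.
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile(
    "(?: # Start of group\n" +
    " \\\\. # Match either an escaped character\n" +
    "| # or\n" +
    " [^\\\\,]++ # Match one or more characters except comma/backslash\n" +
    ")* # Do this any number of times",
    Pattern.COMMENTS);
Matcher regexMatcher = regex.matcher(subjectString);
int nextFieldStart = 0;
while (regexMatcher.find()) {
    // The pattern can match the empty string, so find() also reports an empty
    // match at the comma that ends a non-empty field; skip those.
    if (regexMatcher.start() < nextFieldStart) {
        continue;
    }
    matchList.add(regexMatcher.group());
    nextFieldStart = regexMatcher.end() + 1;
}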
Result: ["", "", "\\,\\\\", "", ""]
I used a possessive quantifier (++) to avoid excessive backtracking caused by the nested quantifiers.
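One caveat when driving this pattern with Matcher.find(): the pattern can match the empty string, so find() also reports an empty match at the comma that ends each non-empty field. A possible way to sidestep this is to match exactly one field at a time with region() and lookingAt() and then step over the comma by hand. The sketch below does that; the names FieldMatcher and matchFields are hypothetical, and it assumes well-formed input (no dangling backslash at the end).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldMatcher {
    // Same pattern as above, written without Pattern.COMMENTS.
    private static final Pattern FIELD = Pattern.compile("(?:\\\\.|[^\\\\,]++)*");

    // Hypothetical helper: matches one field at a time, then steps over the separator.
    public static List<String> matchFields(String subject) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(subject);
        int pos = 0;
        while (true) {
            m.region(pos, subject.length());
            // The pattern can match the empty string, so lookingAt() always succeeds.
            m.lookingAt();
            fields.add(m.group());
            pos = m.end();
            if (pos >= subject.length()) {
                break;
            }
            pos++; // skip the comma (assumes no dangling backslash in the input)
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(matchFields(",,\\,\\\\,,").size());
    }
}
```

Each loop iteration yields exactly one field, so the number of results is always the number of unescaped commas plus one.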
Answer 3 (score: 0)
I used the following solution for a generic string splitter that handles quotes (' and ") and escape (\) characters.
import java.util.ArrayList;
import java.util.List;

public static List<String> split(String str, final char splitChar) {
    List<String> queries = new ArrayList<>();
    int length = str.length();
    int start = 0, current = 0;
    char ch, quoteChar;
    while (current < length) {
        ch = str.charAt(current);
        // Handle escape char by skipping next char
        if (ch == '\\') {
            current++;
        } else if (ch == '\'' || ch == '"') { // Handle quoted values
            quoteChar = ch;
            current++;
            while (current < length) {
                ch = str.charAt(current);
                // Handle escape char by skipping next char
                if (ch == '\\') {
                    current++;
                } else if (ch == quoteChar) {
                    break;
                }
                current++;
            }
        } else if (ch == splitChar) { // Split string
            // Note: the split character itself is kept at the end of the value
            queries.add(str.substring(start, current + 1));
            start = current + 1;
        }
        current++;
    }
    // Add last value (skipped if nothing follows the last split character)
    if (start < current) {
        queries.add(str.substring(start));
    }
    return queries;
}

public static void main(String[] args) {
    String str = "abc,x\\,yz,'de,f',\"lm,n\"";
    List<String> queries = split(str, ',');
    System.out.println("Size: "+queries.size());
    for (String query : queries) {
        System.out.println(query);
    }
}
This gives the result:
Size: 4
abc,
x\,yz,
'de,f',
"lm,n"