I have strings that are supplied to me by a third-party application. I want to tokenize them and turn them into key-value pairs.
name=\"Student one\" grade=\"fifth grade\" gender=m place=\"some place in this earth\" dob=30/02/1900 enrolled
Expected tokenized output:
name = \"Student one\"
grade=\"fifth grade\"
gender=m
place=\"some place in this earth\"
dob=30/02/1900
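I can't simply tokenize on spaces, because there are spaces inside the \"...\" sections that must not be split.
Matching up to the second occurrence of \" doesn't help either, because values such as gender=m appear in between without any \"...\" around them.
How can I split on a pattern, but skip the split when the text lies between \" and \"?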
Answer 0 (score: 1)
How about the following:
(?:\\"[^"\\]*\\"|[^\s\\"])+
In Java, it can be used like this (welcome to Java backslash hell):
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString holds the input text to tokenize.
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile(
    "(?: # Start of group, matching...\n" +
    " \\\\\" # an escaped quote\n" +
    " [^\"\\\\]* # followed by 0+ characters except backslashes or quotes\n" +
    " \\\\\" # and another escaped quote\n" +
    "| # OR\n" +
    " [^\\s\\\\\"] # a character except spaces, backslashes or quotes.\n" +
    ")+ # Repeat as many times as possible (at least once)",
    Pattern.COMMENTS);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}
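For anyone who wants to try this out, here is a minimal self-contained sketch of the same pattern without the inline comments. The class name `EscapedQuoteTokenizer` and the method `tokenize` are made up for illustration, and the sketch assumes the input really contains literal backslash-quote pairs, as shown in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedQuoteTokenizer {
    // Same regex as above: a run of \"...\" segments and/or characters
    // that are not whitespace, backslashes or quotes.
    private static final Pattern TOKEN =
            Pattern.compile("(?:\\\\\"[^\"\\\\]*\\\\\"|[^\\s\\\\\"])+");

    // Hypothetical helper: collects every match as one token.
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumption: backslashes are part of the data, so each \" in the
        // question becomes \\\" in a Java string literal.
        String input = "name=\\\"Student one\\\" grade=\\\"fifth grade\\\" gender=m "
                + "place=\\\"some place in this earth\\\" dob=30/02/1900 enrolled";
        for (String token : tokenize(input)) {
            System.out.println(token);
        }
    }
}
```

Note that a bare word such as `enrolled` also comes back as a token with this pattern, so it has to be filtered out afterwards if only key=value pairs are wanted.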
Answer 1 (score: 1)
The basic approach is to track the opening and closing double quotes (") and to ignore any spaces that fall inside them.
import java.util.ArrayList;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        String data = "name=\"Student one\" grade=\"fifth grade\" gender=m place=\"some place in this earth\" dob=30/02/1900";
        List<String> list = new ArrayList<String>();
        StringBuilder tmp = new StringBuilder();
        boolean insideQuotes = false;
        for (int i = 0; i < data.length(); ++i) {
            char c = data.charAt(i);
            if (c == '"') {
                // Every quote toggles between inside and outside a quoted value.
                insideQuotes = !insideQuotes;
            }
            if (c == ' ' && !insideQuotes) {
                // A space outside quotes ends the current token.
                if (tmp.length() > 0) {
                    list.add(tmp.toString());
                    tmp.setLength(0);
                }
            } else {
                tmp.append(c);
            }
        }
        // Add the final token, which is not followed by a space.
        if (tmp.length() > 0) {
            list.add(tmp.toString());
        }
        for (String token : list) {
            System.out.println(token);
        }
    }
}
Output
name="Student one"
grade="fifth grade"
gender=m
place="some place in this earth"
dob=30/02/1900
Answer 2 (score: 1)
You can try this:
String s = "name= \\\"Student one\\\" grade=\\\"fifth grade\\\" gender=m place=\\\"some place in this earth\\\" dob=30/02/1900 enrolled";
Pattern pattern = Pattern.compile(
"\\S+\\s*=\\s* # Key= with optional spaces around\n"
+"("
+"\\\\\"[^\"\\\\]*\\\\\" # capture in between \"...\" \n"
+"| # OR\n"
+"\\S+ # non space characters!\n"
+")"
, Pattern.COMMENTS);
Matcher m = pattern.matcher(s);
while (m.find( )) {
System.out.println(m.group(0));
}
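Normally the pattern would be written as follows; the comments in the version above were added only to make the regex easier to understand: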
Pattern pattern = Pattern.compile("\\S+\\s*=\\s*(\\\\\"[^\"\\\\]*\\\\\"|\\S+)");
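Since the question asks for key-value pairs, the same idea can be taken one step further with capture groups. The following is only a sketch under a few assumptions: the class name `KeyValueParser` and the method `parse` are made up for illustration, keys are assumed to contain neither whitespace nor `=`, and the surrounding \" markers are dropped from the stored values (keep them in the group if you need them).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyValueParser {
    // Group 1: the key. Group 2: text between \" and \". Group 3: an unquoted value.
    private static final Pattern PAIR = Pattern.compile(
            "([^\\s=]+)\\s*=\\s*(?:\\\\\"([^\"\\\\]*)\\\\\"|(\\S+))");

    // Hypothetical helper: returns the pairs in the order they appear in the input.
    public static Map<String, String> parse(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            // Exactly one of groups 2 and 3 takes part in each match.
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            pairs.put(m.group(1), value);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Assumption: the input contains literal backslash-quote pairs.
        String s = "name= \\\"Student one\\\" grade=\\\"fifth grade\\\" gender=m "
                + "place=\\\"some place in this earth\\\" dob=30/02/1900 enrolled";
        for (Map.Entry<String, String> e : parse(s).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A `LinkedHashMap` is used so the pairs keep their input order. A bare word such as `enrolled` has no `=` and is therefore skipped, which matches the expected output in the question.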