I need a Java regex pattern for the following scenarios:
Case 1:
Input string:
"a"
Matches:
a
Case 2:
Input string:
"a b"
Matches:
a b
Case 3:
Input string:
"aA Bb" cCc 123 4 5 6 7xy "\"z9" "\"z9$^"
Matches:
aA Bb
cCc
123
4
5
6
7xy
"z9
"z9$^
Case 4:
Input string:
"a b c
Matches:
None: the quotes are unbalanced, so the pattern match fails.
Case 5:
Input string:
"a b" "c
Matches:
None: the quotes are unbalanced, so the pattern match fails.
Case 6:
Input string:
"a b" p q r "x y z"
Matches:
a b
p
q
r
x y z
Case 7:
Input string:
"a b" p q r "x y \"z\""
Matches:
a b
p
q
r
x y "z"
Case 8:
Input string:
"a b" p q r "x \"y \"z\""
Matches:
a b
p
q
r
x "y "z"
And, of course, the simplest one:
Case 9:
Input string:
a b
Matches:
a
b
I tried the following pattern, but it does not seem to satisfy all of the cases above.
public List<String> parseArgs(String argStr) {
    List<String> params = new ArrayList<String>();
    String pattern = "\\s*(\"[^\"]+\"|[^\\s\"]+)";
    Pattern quotedParamPattern = Pattern.compile(pattern);
    Matcher matcher = quotedParamPattern.matcher(argStr);
    while (matcher.find()) {
        String param = matcher.group();
        System.out.println(param);
        params.add(param);
    }
    return params;
}
public void test(String argStr) {
    String[] testStrings = new String[]{"a", "a b", "a b \"c\"", "a b \"c"};
    for (String s : testStrings) {
        parseArgs(s);
    }
}
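Answer 0 (score: 2)
I don't know of a direct way to solve this with a regex alone.
But you can replace the inner escape sequences with some unique keyword, and then match the resulting string against a regex.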
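// Needs java.util.*, java.util.regex.* and org.apache.commons.lang3.StringUtils
String[] testStrings = new String[]{
    "a", "a b", "a b \"c\"", "a b \"c", "\"a b\" p q r \"x y z\""};
// Group 1: text inside quotes; group 2: an unquoted token
Pattern parsingPattern = Pattern.compile("\"(.*?)\"|([^ \"]+)");
for (int i = 0; i < testStrings.length; i++) {
    // Hide escaped quotes (\") behind a unique keyword. Write back into the
    // array, because reassigning a for-each variable would change nothing.
    testStrings[i] = testStrings[i].replace("\\\"", "@@@");
}
for (String s : testStrings) {
    List<String> params = null;
    int count = StringUtils.countMatches(s, "\"");
    if (count % 2 == 0) { // an odd number of quotes means they are unbalanced
        params = new ArrayList<String>();
        Matcher matcher = parsingPattern.matcher(s);
        while (matcher.find())
            params.add(matcher.group(1) != null ? matcher.group(1) : matcher.group(2));
    }
}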
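Once you have the matches, you can replace the unique keyword in each of them with the original quote character.
I haven't tested the snippet, but I hope you can get it working with a few small tweaks.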
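The steps described in this answer (mask the escaped quotes, check that the remaining quotes are balanced, match, then restore) can be put together as a minimal self-contained sketch. The class name PlaceholderArgParser and the constant PLACEHOLDER are hypothetical names chosen for illustration, the quote count is done with plain JDK calls instead of StringUtils, and the sketch assumes the placeholder text "@@@" never occurs in real input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlaceholderArgParser {
    // Hypothetical placeholder; assumed never to appear in the input itself.
    private static final String PLACEHOLDER = "@@@";
    // Group 1: text inside a quoted token; group 2: an unquoted token.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|([^\\s\"]+)");

    /** Returns the parsed tokens, or null when the quotes are unbalanced. */
    static List<String> parseArgs(String argStr) {
        // Step 1: hide every escaped quote (backslash + quote) behind the placeholder.
        String masked = argStr.replace("\\\"", PLACEHOLDER);
        // Step 2: the quotes that remain must come in pairs.
        long quotes = masked.chars().filter(c -> c == '"').count();
        if (quotes % 2 != 0) {
            return null;
        }
        // Step 3: collect the tokens and turn the placeholder back into a quote.
        List<String> params = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(masked);
        while (matcher.find()) {
            String token = matcher.group(1) != null ? matcher.group(1) : matcher.group(2);
            params.add(token.replace(PLACEHOLDER, "\""));
        }
        return params;
    }

    public static void main(String[] args) {
        // Case 8 from the question, then the unbalanced case 5.
        System.out.println(parseArgs("\"a b\" p q r \"x \\\"y \\\"z\\\"\""));
        System.out.println(parseArgs("\"a b\" \"c"));
    }
}
```

String.replace does literal replacement, so characters such as $ or ^ in the input need no special treatment here.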
Answer 1 (score: 1)
I wrote a "CLIParser" class that gives you the result.
//instantiate the CLIParser
CLIParser parser = new CLIParser("\"a b\" p q r \"x y z\"");
//call the method getTokens which gives you the result.
ArrayList<String> resultTokens = parser.getTokens();
// ---- CLIParser class definition ----
import java.util.ArrayList;
import java.util.Stack;

class InvalidCommandException extends Exception {
    public InvalidCommandException(String message) {
        super(message);
    }
}

class CLIParser {
    private final String cmdString;

    public CLIParser(String cmdString) {
        this.cmdString = cmdString;
    }

    public ArrayList<String> getTokens() throws Exception {
        ArrayList<String> finalTokens = new ArrayList<String>();
        char[] inArray = this.cmdString.toCharArray();
        int valid = checkIfTheStringIsValid(inArray);
        if (valid != -1) {
            throw new InvalidCommandException(
                    "Invalid command. Couldn't identify sequence at position " + valid);
        }
        StringBuffer token = new StringBuffer();
        boolean inToken = false; // true once the current token has started
        int i = 0;
        while (i < inArray.length) {
            if (inArray[i] == ' ') {
                // A space outside quotes ends the current token, if there is one.
                if (inToken) {
                    finalTokens.add(token.toString());
                    token = new StringBuffer();
                    inToken = false;
                }
                i++;
            } else if (inArray[i] == '"') {
                // Opening quote: copy everything up to the closing unescaped quote.
                inToken = true;
                i++;
                while (checkIfLastQuote(inArray, i)) {
                    i = appendChar(inArray, i, token);
                }
                i++; // skip the closing quote
            } else {
                inToken = true;
                i = appendChar(inArray, i, token);
            }
        }
        if (inToken) {
            finalTokens.add(token.toString());
        }
        return finalTokens;
    }

    // Returns -1 when the unescaped quotes are balanced,
    // otherwise the position of the last unescaped quote.
    private static int checkIfTheStringIsValid(char[] inArray) {
        Stack<Character> myStack = new Stack<Character>();
        int pos = 0;
        for (int i = 0; i < inArray.length; i++) {
            if (inArray[i] == '"' && (i == 0 || inArray[i - 1] != '\\')) {
                pos = i;
                if (myStack.isEmpty())
                    myStack.push(inArray[i]);
                else
                    myStack.pop();
            }
        }
        return myStack.isEmpty() ? -1 : pos;
    }

    // Returns true while position i is still inside the quoted section,
    // i.e. the character there is not an unescaped closing quote.
    private static boolean checkIfLastQuote(char[] inArray, int i) {
        if (i >= inArray.length)
            return false;
        if (inArray[i] == '"')
            return i > 0 && inArray[i - 1] == '\\';
        return true;
    }

    // Appends the character at i; for an escaped quote (\") only the quote is kept.
    // Returns the index of the next unread character.
    private static int appendChar(char[] inArray, int i, StringBuffer token) {
        if (inArray[i] == '\\' && i + 1 < inArray.length && inArray[i + 1] == '"') {
            i++;
        }
        token.append(inArray[i]);
        return i + 1;
    }
}
Answer 2 (score: 0)
Try this:
("\S+?(?: \S+?)*"|\S+)
See it in action: http://regex101.com/r/fA5hN0
Just run a global match and return \1. Each returned capture group should contain what you want.
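In Java, "run a global match and return \1" corresponds to calling Matcher.find() in a loop and reading group(1). The following is only a sketch under a few assumptions: the class name GlobalMatchDemo is hypothetical, the unquoted alternative is written greedily as \S+ so that multi-character tokens such as cCc are not split into single characters, and the surrounding quotes are stripped by hand because the capture group includes them. Note that this pattern does not reject unbalanced quotes and does not treat \" specially.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GlobalMatchDemo {
    // Either a quoted run of space-separated words, or a run of non-space characters.
    private static final Pattern TOKEN =
            Pattern.compile("(\"\\S+?(?: \\S+?)*\"|\\S+)");

    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        // Each successful find() is one step of the "global match".
        while (matcher.find()) {
            String token = matcher.group(1);
            // The group still carries the quotes, so remove one pair if present.
            if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
                token = token.substring(1, token.length() - 1);
            }
            result.add(token);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("\"a b\" p q r \"x y z\""));
    }
}
```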
Answer 3 (score: 0)
To get you started, you can use this Java regex-based code:
public List<String> parseArgs(String argStr, Pattern validPattern, Pattern parsePattern) {
    List<String> params = null;
    if (validPattern.matcher(argStr).matches()) {
        params = new ArrayList<String>();
        Matcher matcher = parsePattern.matcher(argStr);
        while (matcher.find())
            params.add(matcher.group(1) != null ? matcher.group(1) : matcher.group(2));
    }
    return params;
}
public void parseIt() {
    Pattern validatePattern = Pattern.compile("^(?=(?:(?:[^\"]*\"){2})*[^\"]*$).*$");
    Pattern parsingPattern = Pattern.compile("\"([^\"]*)\"|(\\w+)");
    String[] testStrings = new String[]{
        "a", "a b", "a b \"c\"", "a b \"c", "\"a b\" p q r \"x y z\""};
    for (String s : testStrings) {
        List<String> parsedList = parseArgs(s, validatePattern, parsingPattern);
        System.out.printf("input: %-30s :: parsed: %s%n", s, parsedList);
    }
}
input: a :: parsed: [a]
input: a b :: parsed: [a, b]
input: a b "c" :: parsed: [a, b, c]
input: a b "c :: parsed: null
input: "a b" p q r "x y z" :: parsed: [a b, p, q, r, x y z]
PS: I noticed that you added nested (escaped) quotes in your latest edit; this answer would need to be extended to handle them.
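One possible way to extend the regex approach to the escaped quotes mentioned in the PS is sketched here. It is not the answer author's code: the class name EscapedQuoteArgParser is hypothetical, and it swaps the separate validation pattern for a single pattern anchored with \G, so that every match must start where the previous one ended. If anything other than whitespace is left over after the last match, the input is treated as unbalanced and null is returned, mirroring the null result above. The sketch also assumes that escapes only matter inside quoted tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EscapedQuoteArgParser {
    // \G anchors each match to the end of the previous one, so no input is skipped.
    // Group 1: body of a quoted token, where a backslash escapes the next character.
    // Group 2: an unquoted token.
    private static final Pattern TOKEN = Pattern.compile(
            "\\G\\s*(?:\"((?:[^\"\\\\]|\\\\.)*)\"|([^\\s\"]+))");

    /** Returns the parsed tokens, or null if the input cannot be fully tokenized. */
    static List<String> parseArgs(String argStr) {
        List<String> params = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(argStr);
        int end = 0;
        while (matcher.find()) {
            String quoted = matcher.group(1);
            // Inside quotes, drop the backslash of each escape (\" becomes ").
            params.add(quoted != null
                    ? quoted.replaceAll("\\\\(.)", "$1")
                    : matcher.group(2));
            end = matcher.end();
        }
        // Leftover non-whitespace text means a quote was never closed.
        return argStr.substring(end).trim().isEmpty() ? params : null;
    }

    public static void main(String[] args) {
        // Case 8 from the question, then the unbalanced case 4.
        System.out.println(parseArgs("\"a b\" p q r \"x \\\"y \\\"z\\\"\""));
        System.out.println(parseArgs("\"a b c"));
    }
}
```

Because the two alternatives inside the quoted body cannot match the same character, the pattern does not backtrack heavily on long inputs.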