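Java regex to match a string that contains a word without digits, optionally separated by commas

Time: 2013-12-03 09:36:13

Tags: java regex

Inspired by a previous question, I am trying to find a regex that matches a string containing at least one word made up only of letters, with no digits, so \w is not suitable. Words may be separated by commas, but a word only counts if it does not sit next to two commas in a row.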

This is the best I have found:

(.*\s+,?)*([a-zA-Z]+)+(,?\s+.*)*

But it fails to match the following strings, which it should match:

aaaaa,11111
11111,aaaaa
11111,aaaaa,
,aaaaa
aaaaa,
,aaaaa,
aaaaa,11111,,
,,aaaaa,bbbbb
aaaaa,,bbbbb,ccccc
aaaaa,bbbbb,,ccccc
aaaaa,bbbbb,ccccc
aaaaa,11111
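
One possible explanation, offered as a reading of the pattern rather than something stated in the question: both outer groups, (.*\s+,?)* and (,?\s+.*)*, require at least one whitespace character next to the optional comma, so a comma that touches a word directly can never be consumed. The sketch below compares inputs that differ only in that whitespace. The class and method names are illustrative and not part of the original program.

```java
class OriginalRegexCheck {
    // The regex from the question, unchanged.
    static final String REGEX = "(.*\\s+,?)*([a-zA-Z]+)+(,?\\s+.*)*";

    // String.matches requires the whole input to match the pattern.
    static boolean matchesOriginal(String input) {
        return input.matches(REGEX);
    }

    public static void main(String[] args) {
        // Pairs that differ only in the whitespace after the comma.
        String[] samples = {
            "aaaaa, 11111", "aaaaa,11111",
            "11111, aaaaa", "11111,aaaaa"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + matchesOriginal(s));
        }
    }
}
```

In this sketch the variants with a space after the comma match, while the variants without it do not, which is consistent with the list above.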

Here is a test program that checks whether a regex is correct:

import java.util.*;
import java.lang.*;
import java.io.*;

class Ideone
{
public static void main (String[] args) throws java.lang.Exception
{
    String regex = "(.*\\s+,?)*([a-zA-Z]+)+(,?\\s+.*)*";
    String shouldMatch[] = new String[] {
        "aaaaa",
        "aaaaa bbbbb",
        "aaaaa 11111",
        "11111 aaaaa",
        "aaaaa,11111",
        "aaaaa, 11111",
        "aaaaa,  11111",
        "11111,aaaaa",
        "11111, aaaaa",
        "11111,  aaaaa",
        "11111,aaaaa,",
        ",aaaaa",
        "aaaaa,",
        ",aaaaa,",
        "aaaaa,11111,,",
        ",,aaaaa,bbbbb",
        "aaaaa1111 bbbbb",
        "aaaaa1111 bbbbb ccccc",
        "aaaaa1111bbbbb ccccc",
        "aaaaa11111bbbbb ccccc 22222",
        ",,aaaaa bbbbb",
        "aaaaa,,bbbbb ccccc",
        "aaaaa,,bbbbb,ccccc",
        "aaaaa,bbbbb,,ccccc",
        "aaaaa,bbbbb,ccccc",
        "aaaaa,11111"
    };

    String shouldNotMatch[] = new String[] {
        "aaaaa11111",
        "11111bbbbb",
        "aaaaa11111bbbbb",
        "aaaaa11111bbbbb 11111ccccc",
        "aaaaa11111bbbbb ccccc11111",
        "aaaaa,,bbbbb",
        "aaaaa,,11111",
        ",,aaaaa",
        "aaaaa,,",
        "11111",
        "11111,22222",
        "11111 22222",
        ""
    };

    boolean result = true;

    for(String stringToTest : shouldMatch){
        if (!(stringToTest.matches(regex))){
            System.out.println(stringToTest + " Don't match. WRONG.");
            result = false;
        }
    }

    for(String stringToTest : shouldNotMatch){
        if (stringToTest.matches(regex)){
            System.out.println(stringToTest + " Match. WRONG.");
            result = false;
        }
    }

    if (result){
        System.out.println("Congratulations, your regex is right.");
    }
    else {
        System.out.println("Result of one or more tests is wrong.");
    }
}
}

Edit: added some strings that should not match the regex: the empty string and digit-only strings (with commas or spaces).

2 Answers:

Answer 0 (score: 2)

This works; I checked it against your test program:

String regex = "^.*?(?<=\\s|^|,)(?<!,,)[A-Za-z]+(?!,,)(?=\\s|,|$).*$";

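^ "starts with"

.*? any non-newline characters, non-greedy

(?<=\\s|^|,) positive lookbehind for whitespace, the start of the string, or a comma, since those are the only valid characters that can come before what we are defining as a word

(?<!,,) negative lookbehind for ,, since two commas are not allowed before a word

[A-Za-z]+ one or more letters

(?!,,) negative lookahead for ,, since two commas are not allowed after a word

(?=\\s|,|$) positive lookahead for whitespace, the end of the string, or a comma, since those are the only valid characters that can come after what we are defining as a word

.* any remaining characters

$ "ends with"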
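
To see which word the lookarounds actually accept, here is a minimal sketch that drops the surrounding ^.*? and .*$ and uses Matcher.find() to list every qualifying word. The class name WordFinderDemo and the method findWords are made up for illustration. Only the pattern itself comes from this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordFinderDemo {
    // Core of the answer's regex, without the leading "^.*?" and trailing ".*$".
    static final Pattern WORD = Pattern.compile(
        "(?<=\\s|^|,)(?<!,,)[A-Za-z]+(?!,,)(?=\\s|,|$)");

    // Collects every letters-only word that passes all four lookarounds.
    static List<String> findWords(String input) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(input);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String[] samples = {
            "aaaaa,,bbbbb ccccc", // only ccccc is free of double commas
            ",aaaaa,",            // single commas on both sides are fine
            "aaaaa,,bbbbb",       // both words touch a double comma
            "aaaaa11111"          // the letters run straight into digits
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + findWords(s));
        }
    }
}
```

A string passes the full regex exactly when this list is non-empty, assuming the input contains no line terminators (the . in the full pattern does not match them by default).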

Answer 1 (score: 1)

Based on your examples, the following should work:

String regex = "(?i)(?=.*?(?<!,,)\\b[a-z]+\\b(?!,,))[, \\w]+";
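
A small sketch for trying this pattern on a few strings from the question. The class and method names are illustrative only. One thing worth noting, as far as I can tell: the trailing [, \w]+ must consume the whole input, so the string may contain only commas, spaces, and word characters. The last sample uses a tab to show this. That input is my own and does not come from the question's test lists.

```java
class Answer1Check {
    // The regex from this answer, unchanged.
    static final String REGEX =
        "(?i)(?=.*?(?<!,,)\\b[a-z]+\\b(?!,,))[, \\w]+";

    // String.matches requires the whole input to match the pattern.
    static boolean matchesAnswer1(String input) {
        return input.matches(REGEX);
    }

    public static void main(String[] args) {
        String[] samples = {
            "aaaaa,,bbbbb ccccc", // in the question's shouldMatch list
            "aaaaa,,bbbbb",       // in the question's shouldNotMatch list
            "aaaaa11111",         // in the question's shouldNotMatch list
            "",                   // in the question's shouldNotMatch list
            "aaaaa\t11111"        // extra input: a tab is not in [, \w]
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + matchesAnswer1(s));
        }
    }
}
```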