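How to write a regular expression to check whether a string contains every letter

Time: 2013-12-05 22:01:36

Tags: java regex contains

On a website I found some alternatives to "The quick brown fox jumps over the lazy dog", and I decided to write a small program to check whether the alternatives are valid.

For those interested, I wrote the following program (using the file-reading idea from this post) to check the sentences in a file: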

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class TestClass {
    public static void main(String... aArgs)  {
        TestClass tc = new TestClass();
        try {
            String[] pieces = tc.splitFile("/home/user2609980/Desktop/text");
            for (String line : pieces) {
                if (line.contains("a") &&
                        line.contains("b") &&
                        line.contains("c") &&
                        line.contains("d") &&
                        line.contains("e") &&
                        line.contains("f") &&
                        line.contains("g") &&
                        line.contains("h") &&
                        line.contains("i") &&
                        line.contains("j") &&
                        line.contains("k") &&
                        line.contains("l") &&
                        line.contains("m") &&
                        line.contains("n") &&
                        line.contains("o") &&
                        line.contains("p") &&
                        line.contains("q") &&
                        line.contains("r") &&
                        line.contains("s") &&
                        line.contains("t") &&
                        line.contains("u") &&
                        line.contains("v") &&
                        line.contains("w") &&
                        line.contains("x") &&
                        line.contains("y") &&
                        line.contains("z")) {
                    System.out.println("Matches: " + line);
                } else {
                    System.out.println("Does not match: " + line);
                }
            }

        } catch (Exception ex) {
            System.out.println(ex.getMessage());
        }
    }

    public String[] splitFile(String file) throws IOException {

        BufferedReader br = new BufferedReader(new FileReader(file));
        try {
            StringBuilder sb = new StringBuilder();
            String line = br.readLine();

            while (line != null) {
                sb.append(line);
                sb.append('\n');
                line = br.readLine();
            }
            String everything = sb.toString();
            String[] pieces = everything.split("\n");
            return pieces;
        } finally {
            br.close();
        }

    }
}

This is the output:

Matches: The quick brown fox jumps over the lazy dog
Does not match: Pack my box with five dozen liquor jugs.
Matches: Several fabulous dixieland jazz groups played with quick tempo.
Does not match: Back in my quaint garden, jaunty zinnias vie with flaunting phlox.
Does not match: Five or six big jet planes zoomed quickly by the new tower.
Matches: Exploring the zoo, we saw every kangaroo jump and quite a few carried babies.
Matches: I quickly explained that many big jobs involve few hazards.
Does not match: Jay Wolf is quite an expert on the bass violin, guitar, dulcimer, ukulele and zither.
Matches: Expect skilled signwriters to use many jazzy, quaint old alphabets effectively.
Matches: The wizard quickly jinxed the gnomes before they vaporized.

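I would like to improve this program in two ways. The first, which is my actual question, is how to write something more efficient than checking each letter of the alphabet separately. How can I write something like: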

line.Contains([regex])

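if that is possible?

The bonus question is how to make the program print which letters are missing when a line does not match. Of course I could write an if-else for every letter, but I am hoping for a prettier way.

Thanks for your attention, I look forward to your responses.

2 answers:

Answer 0 (score: 3)

I think the simplest way is to use a loop like this: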

boolean allChars = true;
String uline = line.toUpperCase();
for (char c='A'; c<='Z'; c++) {
    if (uline.indexOf(c) < 0) {
        allChars = false;
        break;
    }
}

That is, run a loop from 65 (A) to 90 (Z) and check whether each character is present in the upper-cased input string.
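The same loop can also cover the bonus question if it collects the absent letters instead of stopping at the first one. The following is only a minimal sketch under that assumption; the class name `MissingLetters` and the method `missingLetters` are hypothetical and do not appear in the original posts.

```java
import java.util.Locale;

class MissingLetters {
    // Returns the letters A-Z that do not occur in the line, ignoring case.
    // An empty result means the line contains every letter.
    static String missingLetters(String line) {
        // Locale.ROOT keeps the upper-casing independent of the default locale
        String uline = line.toUpperCase(Locale.ROOT);
        StringBuilder missing = new StringBuilder();
        for (char c = 'A'; c <= 'Z'; c++) {
            if (uline.indexOf(c) < 0) {
                missing.append(c);
            }
        }
        return missing.toString();
    }

    public static void main(String[] args) {
        String line = "Pack my box with five dozen liquor jugs.";
        String missing = missingLetters(line);
        if (missing.isEmpty()) {
            System.out.println("Matches: " + line);
        } else {
            System.out.println("Does not match (missing " + missing + "): " + line);
        }
    }
}
```

Because the line is upper-cased first, a sentence such as "Pack my box with five dozen liquor jugs." is reported as a match even though its only "p" is a capital letter.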

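Answer 1 (score: 0)

This is an O(n) solution that tries to speed things up by avoiding a loop inside a loop. However, performance tests show that it is not worth it in this case.

Note that I was wrong when I assumed your code was O(n^2). It is not, even though you have a loop inside a loop, because the outer loop iterates over a constant (the 26 letters).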

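import java.util.HashMap;
import java.util.Map;

// Generic type arguments must be wrapper types, not primitives
Map<Character, Boolean> letters = new HashMap<Character, Boolean>();
String uline = line.toUpperCase();

// Single pass over the line: record every character that occurs
for (int i = 0; i < uline.length(); i++) {
    letters.put(uline.charAt(i), true);
}

// Constant-size pass over the alphabet: every letter must have been recorded
boolean allChars = true;
for (char c = 'A'; c <= 'Z'; c++) {
    if (letters.get(c) == null) {
        allChars = false;
        break;
    }
}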

If you want a regex that represents an AND operation, you can emulate it with positive lookahead assertions, but I have a feeling it's going to be slow. See https://stackoverflow.com/a/470602/227299

(?=.*a)(?=.*b)(?=.*c)(?=.*d)(?=.*e)(?=.*f)(?=.*g)(?=.*h)(?=.*i)(?=.*j)(?=.*k)(?=.*l)(?=.*m)(?=.*n)(?=.*o)(?=.*p)(?=.*q)(?=.*r)(?=.*s)(?=.*t)(?=.*u)(?=.*v)(?=.*w)(?=.*x)(?=.*y)(?=.*z)

Be sure to use the case-insensitive modifier.

See it in action at http://regex101.com/r/yJ4cU6
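In Java, the lookahead pattern could be applied roughly as follows. This is a sketch rather than a tested recommendation: the class name `PangramRegex` and its methods are hypothetical, the pattern is built in a loop instead of being typed out, and `Matcher.lookingAt()` is used so that the match is attempted only at the start of the line.

```java
import java.util.regex.Pattern;

class PangramRegex {
    // Compiled once; equivalent to (?=.*a)(?=.*b)...(?=.*z) with the
    // case-insensitive flag set.
    static final Pattern ALL_LETTERS = buildPattern();

    static Pattern buildPattern() {
        StringBuilder sb = new StringBuilder();
        for (char c = 'a'; c <= 'z'; c++) {
            sb.append("(?=.*").append(c).append(')');
        }
        return Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }

    // True if every lookahead succeeds from the start of the line.
    // Assumes the line holds no line terminators, since '.' does not match them.
    static boolean containsAllLetters(String line) {
        return ALL_LETTERS.matcher(line).lookingAt();
    }

    public static void main(String[] args) {
        String line = "The quick brown fox jumps over the lazy dog";
        System.out.println((containsAllLetters(line) ? "Matches: " : "Does not match: ") + line);
    }
}
```

Note that `String.matches` would not work with the bare lookahead pattern, because it requires the whole string to be consumed and the lookaheads consume nothing; appending `.*` to the pattern would be needed in that case.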

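I created some performance tests to see whether the approach I suggested makes sense, and it does not. I would stick with anubhava's suggestion. Hopefully this answer helps you think about performance (and premature optimization).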