Question

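I'm trying to use the scanner.next(Pattern p) method to pick out sections of a large text file that begin with jim and end with bob. For example:

hello hello jimbob jimhellohellobob hellojim hellobob

Calling next() three times would return "jimbob", "jimhellohellobob" and "jim hellobob".

But preferably not "jimbob jimhellohellobob hellojim hellobob", i.e. the word 'jim' should be excluded from the text allowed between the start and the end.

I suck at regex, let alone Java regex, so I'm not having much luck. This is where I am at the moment: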

    String test = "hello hello jimbob jimhellohellobob hellojim hellobob ";

    Pattern p = Pattern.compile(".*jim.*bob.*");
    Scanner s = new Scanner(test);
    String temp;

    while(s.hasNext(p)){
        temp = s.next(p);
        System.out.println(temp);
    }

This prints nothing. Any ideas where I'm going wrong?

Answer 1

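You are using the wrong class. Scanner.hasNext(Pattern) only tests the next whitespace-delimited token, and the first token, hello, does not match your pattern, so the loop body never runs. To find all matches of a regex you need to use Matcher and its find method. Also, because of the .* at the start and end, your current regex accepts any string that contains jim followed somewhere later by bob. In addition, .* is greedy, so for data like hello jimbob hello bob the pattern jim.*bob would match jimbob hello bob rather than just the jimbob part. To make .* reluctant, add ? after it: .*?.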
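
To make the greedy versus reluctant difference concrete, here is a minimal sketch that runs both patterns on the sample data from the paragraph above. The class name GreedyVsReluctant and the helper firstMatch are made up for illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class GreedyVsReluctant {
    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String data = "hello jimbob hello bob";
        // Greedy .* runs to the last "bob" it can reach.
        System.out.println(firstMatch("jim.*bob", data));  // jimbob hello bob
        // Reluctant .*? stops at the first "bob" it can reach.
        System.out.println(firstMatch("jim.*?bob", data)); // jimbob
    }
}
```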

So your code should look more like this:

Pattern p = Pattern.compile("jim.*?bob"); //depending on what you want you may 
                                          //also need to add word boundary `\\b`
Matcher m = p.matcher(yourText);
while(m.find()){
    System.out.println(m.group());
}
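
For completeness, here is a self-contained sketch of the snippet above applied to the test string from the question. The class name JimBobFinder and the helper findAll are hypothetical. The second pattern, NO_INNER_JIM, is not from the answer: it is one possible way (a negative lookahead on each character) to meet the question's wish that no further 'jim' appears between the start and the end, and it is offered as an assumption rather than the recommended fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class JimBobFinder {
    // Reluctant pattern from the answer above.
    static final Pattern SIMPLE = Pattern.compile("jim.*?bob");

    // Assumed variant (not from the answer): (?!jim) is a negative lookahead
    // that refuses to consume a character where another "jim" starts.
    static final Pattern NO_INNER_JIM = Pattern.compile("jim(?:(?!jim).)*?bob");

    // Collects every non-overlapping match of p in text, in order.
    static List<String> findAll(Pattern p, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String test = "hello hello jimbob jimhellohellobob hellojim hellobob ";
        System.out.println(findAll(SIMPLE, test));
        // [jimbob, jimhellohellobob, jim hellobob]

        String nested = "jim x jim y bob";
        System.out.println(findAll(SIMPLE, nested));       // [jim x jim y bob]
        System.out.println(findAll(NO_INNER_JIM, nested)); // [jim y bob]
    }
}
```

Note that . does not match line terminators by default, so if a section in the file can span several lines, compiling the pattern with Pattern.DOTALL would probably be needed as well.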
