Regex needs to contain 1 word

Asked: 2011-10-31 21:38:22

Tags: java regex

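I have about 2000 sentences describing fatal incidents, and I want to filter them by cause. To begin with, I want to match sentences of this form: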

______ fell (*) ______ 

to the
off the
from the 

where ______ is a group of one word, and (*) is one of the three phrases listed above.

I tried

(\w*)fell+\s+to\sthe|off\sthe|from\sthe(\w*)

It returns "off the" and so on, but it does not check whether the word fell is there. (The groups are probably not working at all either.)

So what is wrong? I did use fell+, so shouldn't fell have to be there at least once?

2 answers:

Answer 0 (score: 1)

You need parentheses around the alternation:

(\w*)fell\s(to\sthe|off\sthe|from\sthe)(\w*)

To avoid creating a capturing group, use (?: ... )

(\w*)fell\s(?:to\sthe|off\sthe|from\sthe)(\w*)
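To illustrate why the parentheses matter: `|` has the lowest precedence in a regex, so the pattern from the question is read as three separate alternatives, and only the first of them requires fell. Also, `fell+` means "fel" followed by one or more "l", not "fell at least once". The sketch below compares the two patterns; the class name `AlternationDemo`, the helper `found`, and the sample sentences are made up for illustration.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Pattern from the question. Without a group, the alternatives are:
    //   (\w*)fell+\s+to\sthe   |   off\sthe   |   from\sthe(\w*)
    static final Pattern UNGROUPED =
            Pattern.compile("(\\w*)fell+\\s+to\\sthe|off\\sthe|from\\sthe(\\w*)");

    // Alternation wrapped in a non-capturing group, so "fell" is always required.
    static final Pattern GROUPED =
            Pattern.compile("(\\w*)fell\\s(?:to\\sthe|off\\sthe|from\\sthe)(\\w*)");

    // True when the pattern matches anywhere in the text.
    static boolean found(Pattern p, String text) {
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        String noFell = "He jumped off the roof";  // hypothetical sample
        String withFell = "He fell off the roof";  // hypothetical sample

        System.out.println(found(UNGROUPED, noFell));  // true: "off the" matches alone
        System.out.println(found(GROUPED, noFell));    // false: "fell" is missing
        System.out.println(found(GROUPED, withFell));  // true
    }
}
```

Note that in these patterns `(\w*)` sits directly next to fell and to the phrase, with no whitespace in between, so both groups come back empty for sentences like the samples; capturing the neighbouring words would need something like `(\w+)\s+` before fell and `\s+(\w+)` after the phrase.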

Answer 1 (score: 0)

I would go with (\\w*)fell\\s[to|off|from\\sthe]\\s*(\\w*)

Here is a small example:

import java.util.regex.*;

class rtest {
    // Note: [to|off|from\\sthe] is a character class, so it matches one character.
    static String regex = "(\\w*)fell\\s[to|off|from\\sthe]\\s*(\\w*)";
    static Pattern pattern = Pattern.compile(regex);

    public static void main(String[] args) {
        process("Bob fell off the bike");
        process("Matt fell to the bottom");
        process("I think Terry fell from the beat of a different drum");
    }

    // Prints the text, then each piece left after splitting it on the regex.
    static void process(String text) {
        System.out.println(text);
        String[] tokens = pattern.split(text);
        for (String t : tokens) System.out.println(t);
        System.out.println(" ");
    }
}

Result:

C:\Documents and Settings\glowcoder\My Documents>javac rtest.java

C:\Documents and Settings\glowcoder\My Documents>java rtest
Bob fell off the bike
Bob
 the bike

Matt fell to the bottom
Matt
 the bottom

I think Terry fell from the beat of a different drum
I think Terry
 the beat of a different drum
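
In the output above, "the" is still left in the second token. That is because `[to|off|from\sthe]` is a character class that matches a single character from the set, not one of the three phrases. Below is a minimal sketch of one alternative, assuming the goal is to extract the one word on each side of fell: it swaps the character class for a non-capturing group and reads the words with `Matcher.group` instead of `split`. The class name `FellWords` and the helper `wordsAround` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FellWords {
    // One word, "fell", one of "to the" / "off the" / "from the", one word.
    static final Pattern FELL = Pattern.compile(
            "(\\w+)\\s+fell\\s+(?:to|off|from)\\s+the\\s+(\\w+)");

    // Returns {wordBefore, wordAfter}, or null when the sentence does not match.
    static String[] wordsAround(String text) {
        Matcher m = FELL.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] samples = {
            "Bob fell off the bike",
            "Matt fell to the bottom",
            "I think Terry fell from the beat of a different drum"
        };
        for (String s : samples) {
            String[] w = wordsAround(s);
            System.out.println(w == null ? "no match" : w[0] + " / " + w[1]);
        }
    }
}
```

For the three sample sentences this prints `Bob / bike`, `Matt / bottom` and `Terry / beat`.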