Find "wh" sentences in a given paragraph using regular expressions in Java

Time: 2013-01-04 16:33:49

Tags: java regex

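What is your name? How do you do? I am in way. Where are you? Are you hungry? I like you.

From the paragraph above, the result should contain all of the wh-questions: "What is your name? Where are you?"

How can I achieve this with a regular expression in Java?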

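2 answers:

Answer 0: (score: 3)

OK, I have tested this code, so it should work now. It looks for all the Wh words I can think of in English, rather than trying to match the letters "Wh" on their own.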

    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String text = "What is your name? How do you do? I am in way. Where are you? Are you hungry? I like you. What about questions that contain a comma, like this one? Do you like my name, Whitney Houston? What is going to happen now, is you are going to do what I say. Is that clear? What's all this then?";

    // A Wh word, an optional 's, whitespace, then text without '?', '.' or '!' up to the closing '?'
    Pattern p = Pattern.compile("(?:Who|What|When|Where|Why|Which|Whom|Whose)(?:'s)?\\s+[^\\?\\.\\!]+\\?");
    Matcher m = p.matcher(text);

    // Collect every match, then print one question per line
    List<String> questions = new ArrayList<String>();
    while (m.find()) questions.add(m.group());

    for (String question : questions) System.out.println(question);

I just realized that a question may start with Who's, so the pattern now allows 's after the Wh word.
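For anyone who wants to run this end to end, here is a minimal self-contained sketch. The class name `WhQuestionFinder` and the method `findWhQuestions` are hypothetical names introduced for illustration; the pattern is equivalent to the one in the answer, only with the redundant escapes inside the character class removed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class WhQuestionFinder {

    // A Wh word, an optional 's, whitespace, then any run of characters
    // other than '?', '.', '!' that ends in '?'.
    private static final Pattern WH_QUESTION = Pattern.compile(
            "(?:Who|What|When|Where|Why|Which|Whom|Whose)(?:'s)?\\s+[^?.!]+\\?");

    static List<String> findWhQuestions(String text) {
        List<String> questions = new ArrayList<>();
        Matcher m = WH_QUESTION.matcher(text);
        while (m.find()) {
            questions.add(m.group());
        }
        return questions;
    }

    public static void main(String[] args) {
        String text = "What is your name? How do you do? I am in way. "
                + "Where are you? Are you hungry? I like you.";
        // Prints:
        // What is your name?
        // Where are you?
        for (String q : findWhQuestions(text)) {
            System.out.println(q);
        }
    }
}
```

Note that the match is case-sensitive, so a lowercase "what" in the middle of a sentence is not treated as the start of a question.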

Answer 1: (score: 1)

Simple version (for the OP's example sentences)...

    // s is the input paragraph; match from "Wh" up to the next '?'
    Pattern p = Pattern.compile("Wh[^\\?]*\\?");
    Matcher m = p.matcher(s);
    while (m.find()) {
        System.out.println(m.group());
    }
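To see what the simple pattern does and does not catch, the sketch below runs it on a short input. The class name `SimpleWhMatchDemo` and the method `matchSimple` are hypothetical, and the sample input (which borrows the "Whitney Houston" sentence from the other answer) is chosen only for illustration. Because the pattern looks for nothing more than the letters `Wh`, it also picks up a capitalized name in the middle of a sentence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class SimpleWhMatchDemo {

    // Applies the simple pattern: "Wh" followed by anything up to the next '?'.
    static List<String> matchSimple(String s) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile("Wh[^\\?]*\\?").matcher(s);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String s = "What is your name? Do you like my name, Whitney Houston?";
        // Prints:
        // What is your name?
        // Whitney Houston?
        for (String match : matchSimple(s)) {
            System.out.println(match);
        }
    }
}
```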

More advanced matching (makes sure the Wh word is at the start of a sentence)...

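    // s is the input paragraph; a match must follow the start of the text, a '?' or a '.'
    Pattern p = Pattern.compile("(^|\\?|\\.) *Wh[^\\?]*\\?");
    Matcher m = p.matcher(s);
    while (m.find()) {
        // Drop the leading terminator and spaces so only the question is printed
        String match = m.group().substring(m.group().indexOf("Wh"));
        System.out.println(match);
    }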
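One thing to watch with the pattern above: each match consumes its closing `?`, so a Wh question that directly follows another Wh question has no terminator left to anchor on and is skipped. A possible variation, sketched here on the assumption that a lookbehind and a capturing group are acceptable, checks the previous terminator without consuming it. It also borrows the idea from the other answer of stopping at `.` and `!` as well as `?`. The names `SentenceStartWhMatcher` and `matchAtSentenceStart` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variation, not part of the original answer.
class SentenceStartWhMatcher {

    // Either the start of the input or a position right after '?', '.' or '!'
    // (checked with a lookbehind, so the terminator is not consumed),
    // then optional whitespace, then the question itself in group 1.
    private static final Pattern WH_AT_START = Pattern.compile(
            "(?:^|(?<=[?.!]))\\s*(Wh[^?.!]*\\?)");

    static List<String> matchAtSentenceStart(String s) {
        List<String> matches = new ArrayList<>();
        Matcher m = WH_AT_START.matcher(s);
        while (m.find()) {
            matches.add(m.group(1)); // group 1 excludes the leading whitespace
        }
        return matches;
    }

    public static void main(String[] args) {
        String s = "What is your name? Where are you? Do you like my name, Whitney Houston?";
        // Prints:
        // What is your name?
        // Where are you?
        for (String match : matchAtSentenceStart(s)) {
            System.out.println(match);
        }
    }
}
```

Using a capturing group also removes the need for the `substring`/`indexOf` step, since the question text is available directly as `m.group(1)`.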