Automatically paraphrasing sentences in Java

Asked: 2013-05-18 16:47:39

Tags: java regex

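In Java, I am trying to paraphrase text automatically using regular expressions.

So I need a way to replace the first match of a regular expression with a randomly generated match of that same regular expression, like this: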

public static String paraphraseUsingRegularExpression(String textToParaphrase, String regexToUse){
    //In textToParaphrase, replace the first match of regexToUse with a randomly generated match of regexToUse, and return the modified string.
}

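So how can I replace the first match of a regular expression in a string with a randomly generated match of that regular expression? (Perhaps the library called xeger would be useful for this.)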

For example, paraphraseUsingRegularExpression("I am very happy today", "(very|extremely) (happy|joyful) (today|at this (moment|time|instant in time))"); would replace the first match of the regular expression with a randomly generated match of that expression, which could produce output such as "I am extremely joyful at this moment" or "I am very happy at this time".

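1 Answer:

Answer 0: (score: 1)

You can do it with the following steps:

First, split the textToParaphrase string on regexToUse. This gives you an array containing the parts of textToParaphrase that do not match the given expression. For example, if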

 textToParaphrase = "I am very happy today for you";
 regexToUse = "(very|extremely) (happy|joyful) (today|at this (moment|time|instant in time))";

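the output will be: {"I am ", " for you"}. Then build a regular expression from these resulting strings (such as "(I am | for you)"). Now split textToParaphrase again, this time on the generated expression, and you get an array of the parts that matched the given regular expression. Finally, replace each matched part with a randomly generated string.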
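The two split steps described above can be checked in isolation. The following is a minimal sketch using only the JDK; the class name SplitDemo and its helper methods are made up for illustration and are not part of the answer. It quotes the unmatched parts with Pattern.quote so that any regex metacharacters in the text are taken literally, and it shows that the second split can return a leading empty string when the text starts with an unmatched part.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class SplitDemo {
    // Step 1: the parts of the text that do NOT match the regex.
    static String[] unmatchedParts(String text, String regex) {
        return text.split(regex);
    }

    // Step 2: split on the unmatched parts to recover the matched parts.
    static String[] matchedParts(String text, String[] unmatched) {
        StringBuilder filter = new StringBuilder();
        for (String part : unmatched) {
            if (part.isEmpty()) {
                continue; // an empty alternative would match at every position
            }
            if (filter.length() > 0) {
                filter.append("|");
            }
            filter.append(Pattern.quote(part)); // treat the text literally
        }
        if (filter.length() == 0) {
            return new String[] { text }; // nothing to filter on
        }
        return text.split(filter.toString());
    }

    public static void main(String[] args) {
        String text = "I am very happy today for you";
        String regex = "(very|extremely) (happy|joyful) "
                + "(today|at this (moment|time|instant in time))";

        String[] unmatched = unmatchedParts(text, regex);
        // Two elements: "I am " and " for you"
        System.out.println(Arrays.toString(unmatched));

        String[] matched = matchedParts(text, unmatched);
        // Two elements: "" (leading empty string) and "very happy today"
        System.out.println(Arrays.toString(matched));
    }
}
```

The leading empty string matters: calling String.replace with an empty target inserts the replacement between every character, so any code that loops over the second array should skip empty elements.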

The code is as follows:

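import java.util.regex.Pattern;
import nl.flotsam.xeger.Xeger; // assumed package name of the Xeger library; adjust to your version

public static String paraphraseUsingRegularExpression(String textToParaphrase, String regexToUse){
    // Parts of the text that do not match regexToUse.
    String[] unMatchedPortionArray = textToParaphrase.split(regexToUse);

    // Build an alternation of those parts, quoted so they are matched literally.
    StringBuilder regExToFilter = new StringBuilder("(");
    for (String unMatchedSegment : unMatchedPortionArray) {
        if (unMatchedSegment.isEmpty()) continue; // an empty alternative would match everywhere
        if (regExToFilter.length() > 1) regExToFilter.append("|");
        regExToFilter.append(Pattern.quote(unMatchedSegment));
    }
    regExToFilter.append(")");

    // Splitting on the unmatched parts leaves the matched parts.
    // If there were no unmatched parts ("()"), treat the whole text as one segment.
    String[] matchedPortionArray = regExToFilter.length() == 2
            ? new String[] { textToParaphrase }
            : textToParaphrase.split(regExToFilter.toString());

    Xeger generator = new Xeger(regexToUse);
    for (String matchedSegment : matchedPortionArray) {
        if (matchedSegment.isEmpty()) continue; // split() can return a leading empty string
        String result = generator.generate(); // a randomly generated match of regexToUse
        // Note: this replaces every occurrence of the segment, not only the first match.
        textToParaphrase = textToParaphrase.replace(matchedSegment, result);
    }
    return textToParaphrase;
}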
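Since the question asks for only the first match to be replaced, a different technique may be simpler than splitting twice: locate the first match with java.util.regex.Matcher and splice the replacement in by position. The following is a minimal sketch of that alternative, not the answer's method. The class name FirstMatchParaphraser, the method name paraphraseFirstMatch, and the Supplier<String> parameter are assumptions introduced for illustration; the supplier stands in for whatever produces the random match, so the sketch runs without any external library.

```java
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class, for illustration only.
class FirstMatchParaphraser {
    // Replaces only the first match of regex in text with generator.get().
    // Returns the text unchanged when the regex does not match.
    static String paraphraseFirstMatch(String text, String regex,
                                       Supplier<String> generator) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (!matcher.find()) {
            return text;
        }
        // Splice by position so the replacement is never interpreted
        // as a regex replacement pattern ($1, backslashes, and so on).
        return text.substring(0, matcher.start())
                + generator.get()
                + text.substring(matcher.end());
    }

    public static void main(String[] args) {
        String regex = "(very|extremely) (happy|joyful) "
                + "(today|at this (moment|time|instant in time))";
        // Fixed stand-in for a random generator, so the demo is deterministic.
        Supplier<String> fixed = () -> "extremely joyful at this moment";
        System.out.println(
                paraphraseFirstMatch("I am very happy today", regex, fixed));
        // prints: I am extremely joyful at this moment
    }
}
```

With Xeger on the classpath, the supplier could presumably wrap the generator used in the answer, for example () -> generator.generate() where generator is new Xeger(regexToUse). Splicing by position also avoids the problem that an unmatched part of the text might reappear inside a matched part.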

Cheers!