JavaScript regular expression to retrieve variables from a sentence

Time: 2012-06-27 18:42:27

Tags: javascript regex nlp

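I would like to know whether it is possible to use a regular expression, or something similar, to extract a number of variables from a predefined sentence.

For example,

if this is the pattern...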

"How many * awards did * win in *?"

and someone types...

"How many gold awards did johnny win in 2008?"

how could I somehow get back...

["gold","johnny","2008"]

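I would also like to know that the input matched the pattern before retrieving the variables, because there will be many different patterns. Note: someone may also type several words in place of a *, e.g. johnny english instead of just johnny.

Thanks

2 answers: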

Answer 0 (score: 3)

var text = "How many gold awards did johnny win in 2008?";
var query = text.match(/^How many ([^\s]+) awards did ([^\s]+) win in ([^\s]+)\?$/i);
if (query) { // match() returns null when the text does not fit the pattern
    query.splice(0, 1); // remove the first item (the full match) since you will not need it
    query[0]; // gold
    query[1]; // johnny
    query[2]; // 2008
}

See MDN - match for details.

Update

It looks like you also want to match johnny english in How many gold awards did johnny english win in 2008? Here is an updated version of the regex:

/^How many (.+) awards did (.+) win in (.+)\?$/i
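As a quick check of the updated regex, here is a minimal sketch. The helper name extract is made up for this illustration and is not part of the answer; it returns the captured values, or null when the text does not fit the pattern.

```javascript
// Updated regex from the answer: (.+) accepts one or more words per slot.
var re = /^How many (.+) awards did (.+) win in (.+)\?$/i;

// Hypothetical helper (name chosen for this sketch only).
function extract(text) {
    var m = text.match(re);
    // m[0] is the full match, so keep only the captured groups.
    return m ? m.slice(1) : null;
}

var multi = extract("How many gold awards did johnny english win in 2008?");
// multi is ["gold", "johnny english", "2008"]

var none = extract("Who won gold in 2008?");
// none is null, which doubles as the "did it match?" check
```

Note that .+ is greedy, so if the literal text between two wildcards (for example " win in ") also appears inside a typed value, the split happens at the last position that still lets the whole pattern match.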

Answer 1 (score: 1)

Based on Derek's answer and SimpleCoder's comment, here is the complete function:

// This function escapes a regex string
// http://simonwillison.net/2006/jan/20/escape/
function escapeRegex(text) {
    return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}

function match(pattern, text) {
    var regex = '^' + escapeRegex(pattern).replace(/\\\*/g, '(.+)') + '$';
    var query = text.match(new RegExp(regex, 'i'));
    if (!query)
        return false;

    query.shift(); // remove first element
    return query;
}

match("How many * awards did * win in *?", "How many gold awards did johnny win in 2008?");
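Since the question mentions many different patterns, one possible way to build on this is to try each pattern in turn and report which one matched. The sketch below repeats the two helpers so that it runs on its own; matchAny and the second pattern are assumptions added for illustration, not part of the original answers.

```javascript
// Helpers repeated from the answer so this sketch is self-contained.
function escapeRegex(text) {
    return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}

function match(pattern, text) {
    // Each escaped "*" in the pattern becomes a capturing wildcard.
    var regex = '^' + escapeRegex(pattern).replace(/\\\*/g, '(.+)') + '$';
    var query = text.match(new RegExp(regex, 'i'));
    if (!query)
        return false;

    query.shift(); // remove the full match, keep the captured values
    return query;
}

// Hypothetical helper: returns the first pattern that matches,
// together with its captured values, or null if none matches.
function matchAny(patterns, text) {
    for (var i = 0; i < patterns.length; i++) {
        var values = match(patterns[i], text);
        if (values) {
            return { pattern: patterns[i], values: values };
        }
    }
    return null;
}

var patterns = [
    "How many * awards did * win in *?",
    "Who won the * award in *?" // made-up second pattern
];

var result = matchAny(patterns, "Who won the gold award in 2008?");
// result.pattern is "Who won the * award in *?"
// result.values is ["gold", "2008"]

var miss = matchAny(patterns, "What time is it?");
// miss is null
```

Patterns are tried in order, so when two patterns could both match the same sentence, the earlier one in the list wins.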