Search and replace with a dictionary, using each key only once

Time: 2011-12-29 19:50:38

Tags: javascript regex search dictionary replace

This is a follow-up to Find multiple keywords within a dictionary.

My problems are...

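  1. First: I believe this also matches partial words. For example, if short is in my dictionary, it also matches inside the word shortly. How can I prevent this?

  2. Second, less important but nice to have: how can I make each key match only once per piece of content? That way short would not be defined twice within the same content area.

Thanks!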

1 Answer:

Answer 0: (score: 5)

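I have implemented the following additional requirements:

  • Do not match short when the text contains shortly (because shortly is a different word).
  • Use each key in the dictionary only once. Example input: key = foo, replacement = bar, content = foo foo. Output: bar foo (only the first foo is replaced).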
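Both requirements can be seen in isolation with plain regular expressions. The snippet below is only an illustrative sketch (the sample sentence and variable names are made up for this example): \b keeps short from matching inside shortly, and a replace() without the g flag stops after the first match.

```javascript
// \b asserts a word boundary, so "short" is not found inside "Shortly".
var wholeWord = /\bshort\b/i;
var partial = /short/i;

var text = "Shortly after, a short film was shown.";

// The unanchored pattern matches at the very start, inside "Shortly".
var partialIndex = text.search(partial);
// The \b version skips "Shortly" and finds the standalone word "short".
var wholeIndex = text.search(wholeWord);

// Without the g flag, replace() only replaces the first match.
var once = "foo foo".replace(/\bfoo\b/i, "bar");

console.log(partialIndex, wholeIndex, once);
```

The full function further down needs more than this, because it has to apply the "only once" rule per key while scanning for all keys at the same time.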

Demo: http://jsfiddle.net/bhGE3/3/

Usage:

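  1. Define dictionary. Each key is used only once.
  2. Define content. A new string is built from this string.
  3. (Optional) Define a replacehandler function. It is called for every match, and its return value replaces the matched phrase.
    The default replacehandler returns the dictionary's phrase for the matched key. The function should take two parameters: key and dictionary.
  4. Call replaceOnceUsingDictionary(dictionary, content, replacehandler).
  5. Process the output, e.g. show content to the user.

Code: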

    var dictionary = {
        "history": "war . ",
        "no": "in a",
        "nothing": "",
        "oops": "",
        "time": "while",
        "there": "We",
        "upon": "in",
        "was": "get involved"
    };
    var content = "Once upon a time... There was no history. Nothing. Oops";
    content = replaceOnceUsingDictionary(dictionary, content, function(key, dictionary){
        return '_' + dictionary[key] + '_';
    });
    alert(content);
    // End of implementation
    
    /*
    * @name        replaceOnceUsingDictionary
    * @author      Rob W http://stackoverflow.com/users/938089/rob-w
    * @description Replaces phrases in a string, based on keys in a given dictionary.
    *               Each key is used only once, and the replacements are case-insensitive
    * @param       Object dictionary  {key: phrase, ...}
    * @param       String content
    * @param       Function replacehandler
    * @returns     Modified string
    */
    function replaceOnceUsingDictionary(dictionary, content, replacehandler) {
        if (typeof replacehandler != "function") {
            // Default replacehandler function.
            replacehandler = function(key, dictionary){
                return dictionary[key];
            }
        }
    
        var patterns = [], // Each key is wrapped in \b...\b, so "foo" doesn't match "food"
            patternHash = {},
            oldkey, key, index = 0,
            output = [];
        for (key in dictionary) {
            // Case-insensitivity:
            key = (oldkey = key).toLowerCase();
            dictionary[key] = dictionary[oldkey];
    
            // Sanitize the key, and push it in the list
            // (backslash and ] are escaped too, so such keys cannot break the pattern)
            patterns.push('\\b(?:' + key.replace(/[\\^$.|?*+()[\]{}]/g, '\\$&') + ')\\b');
    
            // Add entry to hash variable, for an optimized backtracking at the next loop
            patternHash[key] = index++;
        }
        var pattern = new RegExp(patterns.join('|'), 'gi'),
            lastIndex = 0;
    
        // We should actually test using !== null, but for foolproofness,
        //  we also reject empty strings
        while (key = pattern.exec(content)) {
            // Case-insensitivity
            key = key[0].toLowerCase();
    
            // Add to output buffer
            output.push(content.substring(lastIndex, pattern.lastIndex - key.length));
            // The next line is the actual replacement method
            output.push(replacehandler(key, dictionary));
    
            // Update lastIndex variable
            lastIndex = pattern.lastIndex;
    
            // Don't match again by removing the matched word, create new pattern
            patterns[patternHash[key]] = '^';
            pattern = new RegExp(patterns.join('|'), 'gi');
    
            // IMPORTANT: Update lastIndex property. Otherwise, enjoy an infinite loop
            pattern.lastIndex = lastIndex;
        }
        output.push(content.substring(lastIndex, content.length));
        return output.join('');
    }
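
For comparison, here is a minimal sketch of a different approach, not the code above: it builds the pattern once and lets the callback form of String.prototype.replace decide per match whether the key has already been used. The name replaceOnceSinglePass and the lookup/used helpers are hypothetical and introduced only for this example. It also leaves the caller's dictionary untouched by copying the lowercased keys into a separate object.

```javascript
// Hypothetical alternative, not part of the answer's code.
function replaceOnceSinglePass(dictionary, content, replacehandler) {
    if (typeof replacehandler != "function") {
        // Same default as above: return the phrase stored for the key.
        replacehandler = function (key, dictionary) {
            return dictionary[key];
        };
    }

    var lookup = Object.create(null), // lowercased key -> phrase
        used = Object.create(null),   // lowercased keys already replaced
        alternatives = [],
        key, lower;

    for (key in dictionary) {
        if (!Object.prototype.hasOwnProperty.call(dictionary, key)) continue;
        lower = key.toLowerCase();
        if (lower === '') continue; // an empty key would match everywhere
        lookup[lower] = dictionary[key];
        // Escape regex metacharacters in the key
        alternatives.push(lower.replace(/[\\^$.|?*+()[\]{}]/g, '\\$&'));
    }
    if (alternatives.length === 0) return content;

    var pattern = new RegExp('\\b(?:' + alternatives.join('|') + ')\\b', 'gi');

    return content.replace(pattern, function (match) {
        var matchedKey = match.toLowerCase();
        // Later occurrences of a used key are left as they are
        if (used[matchedKey]) return match;
        used[matchedKey] = true;
        return replacehandler(matchedKey, lookup);
    });
}

console.log(replaceOnceSinglePass({ foo: "bar" }, "foo foo"));
```

One difference to keep in mind: the original function rebuilds the pattern after every match, so a used key no longer takes part in matching at all. In this sketch a used key can still consume text, which matters when keys overlap (for example "a b" and "a"): a second "a b" is matched and left unchanged, so the "a" inside it is never replaced. For dictionaries of single words the two approaches should behave the same.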