Highlighting a <pre> element using regular expressions

Time: 2018-03-25 18:41:52

Tags: javascript html regex

I am trying to do some simple JavaScript syntax highlighting in a <pre> element using regular expressions. I am well aware that JavaScript is not a regular language and therefore can't be fully processed by regular expressions, but that is not the point here.

Here is a small piece of that code, just to demonstrate the logic. It actually works; what I am struggling with is the design of such code.

function highlightCode(node, pattern) {
    var match;
    var pos = 0;

    //clear current content of pre and save it in text variable
    var text = node.textContent;
    node.textContent = '';

    //while there is still something to match
    while (match = pattern.exec(text)) {

        //this is unmatched text, create new text node for it
        var before = text.slice(pos, match.index);
        node.appendChild(document.createTextNode(before));

        //if the match matches the first group append it as a strong tag
        if (group1.test(match[1])) {
            var strong = document.createElement('strong');
            strong.appendChild(document.createTextNode(match[1]));
            node.appendChild(strong);
            pos = pattern.lastIndex;

        //if the match matches the second group append it as i tag
        } else if (group2.test(match[1])) {
            var italic = document.createElement('i');
            italic.appendChild(document.createTextNode(match[1].slice(
                0, match[1].length - 1)));
            node.appendChild(italic);
            pos = pattern.lastIndex - 1;
        }
    }
    //append the remaining text as a regular text
    node.appendChild(document.createTextNode(text.slice(pos)));
}

And here is the rest of the code.

//match function, var, return or any word that is followed by (
var pattern = /(\bfunction\b|\bvar\b|\breturn\b|\b[_a-zA-Z][\w_]*\()/g;

//the same as before just separated into groups
var group1 = /\b(function|var|return)\b/;
var group2 = /\b[_a-zA-Z][\w_]*\(/;

var node = document.querySelector('pre');
highlightCode(node, pattern);

What I particularly don't like about this is that I first have to match the text against all the possible patterns combined, and then test each match again against the separate group patterns to tell them apart.

With this logic, I can't run highlightCode for each group individually, because that would erase the previous changes: the content of the pre element is cleared at the start of highlightCode and rebuilt as it runs.

Moreover, I can't use the combined pattern alone, because I wouldn't know how to distinguish between the separate cases (highlight this code one way and that code another way).

Is there a more "correct" approach for this task using just regular expressions, or is this basically it?
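To make the problem of telling the cases apart concrete, here is a minimal sketch of one possible direction, not a definitive answer. It assumes each alternative of the combined pattern gets its own capture group, so the index of the defined group identifies the case and no second test is needed. The names tokenPattern and tokenize, and the segment type labels, are hypothetical and not part of the code above. The sketch also swaps the trailing \( for a lookahead (?=\(), so the parenthesis is never part of the match and the lastIndex - 1 adjustment is not required.

```javascript
// Hypothetical combined pattern (an assumption, not the original one):
//   group 1: a keyword (function, var, return)
//   group 2: an identifier that is immediately followed by "("
// The lookahead (?=\() checks for the parenthesis without consuming it.
var tokenPattern = /\b(function|var|return)\b|\b([_a-zA-Z]\w*)(?=\()/g;

// Hypothetical helper: splits text into typed segments.
// Returns an array of { type: 'text' | 'keyword' | 'call', text: string }.
function tokenize(text) {
    var segments = [];
    var pos = 0;
    var match;

    // Reset state, since a global regex remembers lastIndex between calls.
    tokenPattern.lastIndex = 0;

    while ((match = tokenPattern.exec(text)) !== null) {
        // Unmatched text between the previous match and this one.
        if (match.index > pos) {
            segments.push({ type: 'text', text: text.slice(pos, match.index) });
        }

        // Only the group of the alternative that matched is defined.
        var type = match[1] !== undefined ? 'keyword' : 'call';
        segments.push({ type: type, text: match[0] });

        pos = tokenPattern.lastIndex;
    }

    // Remaining text after the last match.
    if (pos < text.length) {
        segments.push({ type: 'text', text: text.slice(pos) });
    }
    return segments;
}
```

Inside highlightCode, each segment could then be appended as a plain text node, a strong element, or an i element depending on its type. Adding another case would mean adding one more alternative with its own group, at the cost of keeping the group order and the type lookup in sync.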

0 answers:

No answers