Question

Currently I am writing my own BBCode parser. More specifically I am working on the links.

You can click on the link button, and it will insert a link (HTML <a>) around your text. Then that text will of course be clickable.

If you just write a link normally in the textarea, it will also turn that into a link automatically. Let me give you some examples of what I mean so far:

 http://stackoverflow.com
 Changes Automatically To
 <a href="http://stackoverflow.com">http://stackoverflow.com</a>

 If the user clicks the link button, and inserts it around text:
 [link=http://stackoverflow.com]Directs to SO[/link]
 It will then change that to
 <a href="http://stackoverflow.com">Directs to SO</a>

OK, hopefully you still understand what is going on (I know I am confusing). Herein lies the problem: the regex that changes the BBCode to HTML is doing it twice.

Before I continue, let me show you my regexes:

 .replace(/(\[code\][\s\S]*?\[\/code\])|\[(link=)((http|https):\/\/[\S]{0,2000}\.[a-zA-Z]{2,5}(\/[^\[\]\<\>]*)?)\]([\s\S]*?)\[\/link\]/gi, function (m, g1, g2, g3, g4, g5, g6) { return g1 ? g1 : '<a href="' + g3 + '">' + g6 + '</a>'; })
 .replace(/(\[code\][\s\S]*?\[\/code\])|((http|https):\/\/[\S]{0,2000}\.[a-zA-Z]{2,5}(\/[^\[\]\<\>]*)?)/gi, function (m, g1, g2, g3, g4, g5) { return g1 ? g1 : '<a href="' + g2 + '">' + g2 + '</a>'; })

So basically, as you can see, that converts URLs to clickable links. The [link=___] is not working though, because the second regex is trying to rewrite the output of the first regex. Let me show you what I mean:

 [link=http://stackoverflow.com]Directs to SO[/link]
 First Regex Makes It This:
 <a href="http://stackoverflow.com">Directs to SO</a>
 Second Regex Then Makes It This:
 <a href="<a href="http://stackoverflow.com">Directs to SO</a>

So as you can see, it is trying to convert the URLs twice.

How could I make the SECOND regex not change URLs in the <a> tag a second time?

Answer 1

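The mistake you are making is trying to use only regular expressions. JavaScript has other tools in the box.

Instead of running over the text twice, make a single pass with one regular expression. Once a match is made, distinguish between the two cases and act accordingly. I have provided a working example based on the two regexes you gave.

Note: I do not actually recommend using this long, ugly regex.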

/* This long regex is just the two regexps you provided, stuck together with an OR in the middle */
var longUglyRegex = /(?:(\[code\][\s\S]*?\[\/code\])|\[(link=)((http|https):\/\/[\S]{0,2000}\.[a-zA-Z]{2,5}(\/[^\[\]\<\>]*)?)\]([\s\S]*?)\[\/link\])|(?:(\[code\][\s\S]*?\[\/code\])|((http|https):\/\/[\S]{0,2000}\.[a-zA-Z]{2,5}(\/[^\[\]\<\>]*)?))/gi;
// str is the input text; replace() returns a new string, so keep the result
var html = str.replace(longUglyRegex, function (m, g1, g2, g3, g4, g5, g6, g7, g8) {
    if (g1) {
        return g1;
    }
    var linkText,
        linkLocation;
    if (g3) {
        // BBCode: test the URL group, because the link text (g6) may be empty
        linkText = g6;
        linkLocation = g3;
    } else {
        // Plain URL
        linkText = linkLocation = g8;
    }
    return '<a href="' + linkLocation + '">' + linkText + '</a>';
});
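
Why a single pass helps: a chained second replace() scans the HTML that the first one produced, whereas text returned from a replace() callback is not matched again within that same call. A minimal sketch of both behaviours, using simplified stand-in patterns. BB_LINK, BARE_URL and twoPasses are illustrative names, and the patterns are assumptions, not the regexes from the question:

```javascript
// Simplified stand-in patterns (assumptions, looser than the question's regexes).
var BB_LINK = /\[link=(https?:\/\/[^\s\[\]<>"]+)\]([\s\S]*?)\[\/link\]/gi;
var BARE_URL = /https?:\/\/[^\s\[\]<>"]+/gi;

// Two chained passes: the second pass sees the href written by the first.
function twoPasses(text) {
    return text
        .replace(BB_LINK, '<a href="$1">$2</a>')
        .replace(BARE_URL, '<a href="$&">$&</a>');
}

var nested = twoPasses('[link=http://stackoverflow.com]Directs to SO[/link]');
console.log(nested);
// <a href="<a href="http://stackoverflow.com">http://stackoverflow.com</a>">Directs to SO</a>

// Within one replace() call, the markup returned by the callback is not rescanned,
// even though it contains the URL twice.
var once = 'http://stackoverflow.com'.replace(BARE_URL, function (url) {
    return '<a href="' + url + '">' + url + '</a>';
});
console.log(once);
// <a href="http://stackoverflow.com">http://stackoverflow.com</a>
```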

In practice, you should write a much simpler regex that looks roughly like this:

/([BBCODE])?(URL)([/link])?/

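Note: this comes with the standard caveat that regular expressions are not well suited to this task. In your case, though, it may be good enough.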
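
One way the simplified shape above might be made concrete, as a sketch rather than a drop-in replacement. The URL sub-pattern is an assumption and is deliberately looser than the one in the question: it accepts any run of characters up to whitespace, a square or angle bracket, or a double quote. convertLinks, URL_SRC and LINK_PATTERN are hypothetical names. The URL pattern is written once and reused for both the BBCode form and the bare form:

```javascript
// URL sub-pattern, written once (an assumption; looser than the question's).
var URL_SRC = 'https?:\\/\\/[^\\s\\[\\]<>"]+';

// Alternatives are tried in this order at each position in the text.
var LINK_PATTERN = new RegExp(
    '(\\[code\\][\\s\\S]*?\\[\\/code\\])' +                       // group 1: code block
    '|\\[link=(' + URL_SRC + ')\\]([\\s\\S]*?)\\[\\/link\\]' +  // group 2: URL, group 3: label
    '|(' + URL_SRC + ')',                                          // group 4: bare URL
    'gi'
);

function convertLinks(text) {
    return text.replace(LINK_PATTERN, function (match, code, bbUrl, label, bareUrl) {
        if (code) {
            // Leave [code]...[/code] blocks untouched.
            return code;
        }
        if (bbUrl) {
            // [link=URL]label[/link]
            return '<a href="' + bbUrl + '">' + label + '</a>';
        }
        // Plain URL typed directly into the text.
        return '<a href="' + bareUrl + '">' + bareUrl + '</a>';
    });
}

console.log(convertLinks('[link=http://stackoverflow.com]Directs to SO[/link]'));
// <a href="http://stackoverflow.com">Directs to SO</a>
console.log(convertLinks('Visit http://stackoverflow.com today'));
// Visit <a href="http://stackoverflow.com">http://stackoverflow.com</a> today
```

This sketch does no HTML escaping, and a full stop typed directly after a URL ends up inside the link, so treat it as a starting point only.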

Fix Up Regex To Not Re-Grab Old URLs From Text (JS)
