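Multiline and whitespace regex problem that occurs only in C#

Time: 2017-01-17 11:16:41

Tags: c# regex

What is wrong with this code? My goal is to find URLs in a text and wrap them in hyperlink tags. I am using the old diegoperini regex (credit: https://mathiasbynens.be/demo/url-regex). If my input is a single line with no trailing space, everything works.

https://regex101.com/ shows that this regex works (with the global and multiline flags checked), but C# finds nothing.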

using System;
using System.Text.RegularExpressions;

class Program
    {
        static void Main(string[] args)
        {
            String sourcestring = @"Tralala

bla bla bla

https://iqesonline.lt/index.cfm
bla bla bla bla
https://iqesonline.lt/index.cfm?id=99061c04-441e-a138-8254-6c441f7f59b5

ulala.

trampampam";
            // WORKS sourcestring = "https://iqesonline.lt/index.cfm?id=98061c04-441e-a138-8254-6c441f7f59b5";
            // DOES NOT WORK sourcestring = "https://iqesonline.lt/index.cfm?id=98061c04-441e-a138-8254-6c441f7f59b5 ";
            ParseLinksToHtml(sourcestring);


        }
        public static string ParseLinksToHtml(string tekstas)
        {

            string result = tekstas;
            if (!string.IsNullOrEmpty(result))
            {
                // NOT WORKING Regex rx = new Regex(@"^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$");
                //NOT WORKING TOO 
                Regex rx = new Regex(@"^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$", RegexOptions.Multiline | RegexOptions.IgnoreCase);
                result = rx.Replace(result, delegate (Match match)
                {
                    string url = match.ToString();
                    if (url.ToLower().StartsWith("www."))
                    {
                        url = "http://" + url;
                    }
                    return string.Format("<a href=\"{0}\" target=\"_blank\">{1}</a>", url, match.ToString());
                });
            }
            return result;
        }


    }

2 Answers:

Answer 0: (score: 0)

Tested with the following simplified regex string:

@"(?:\w+):\/\/(?:[\w@][\w.:@]+)\/?[\w\.?=%&=\-@/$,]*"

Produces:

Tralala

bla bla bla

<a href="https://iqesonline.lt/index.cfm" target="_blank">https://iqesonline.lt/index.cfm</a>
bla bla bla bla
<a href="https://iqesonline.lt/index.cfm?id=99061c04-441e-a138-8254-6c441f7f59b5" target="_blank">https://iqesonline.lt/index.cfm?id=99061c04-441e-a138-8254-6c441f7f59b5</a>

ulala.

trampampam
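
For reference, here is a minimal sketch of how the simplified pattern above might be wired into a replacement call to produce that output. The class name `SimpleLinkifier` and the method `Linkify` are hypothetical names chosen for illustration. The pattern is the one quoted in this answer; note that it is far more permissive than the diegoperini expression (it accepts any `scheme://` prefix, for instance).

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimpleLinkifier
{
    // Simplified pattern quoted in the answer. It has no ^ or $ anchors,
    // so it can match a URL anywhere in the text.
    private static readonly Regex UrlRx =
        new Regex(@"(?:\w+):\/\/(?:[\w@][\w.:@]+)\/?[\w\.?=%&=\-@/$,]*");

    // Hypothetical helper: wraps every match in an <a> tag.
    public static string Linkify(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }
        return UrlRx.Replace(text, m =>
            string.Format("<a href=\"{0}\" target=\"_blank\">{0}</a>", m.Value));
    }

    public static void Main()
    {
        // Windows line endings and a trailing space, as in the question.
        string input = "bla bla bla\r\nhttps://iqesonline.lt/index.cfm?id=1 \r\nulala.";
        Console.WriteLine(Linkify(input));
    }
}
```

The matched text is inserted into the markup as-is. If the input is untrusted, it would probably be worth passing `m.Value` through `System.Net.WebUtility.HtmlEncode` before formatting.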

Answer 1: (score: 0)

The correct answer is this: removing "^" and "$" makes it work. Credit for the answer goes to magnetron. Thank you.

Regex rx = new Regex(@"(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:/[^\s]*)?", RegexOptions.Multiline | RegexOptions.IgnoreCase);
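
A likely reason the anchored version fails only in C#: in .NET, `$` combined with `RegexOptions.Multiline` matches before `\n` but not before `\r\n`. Assuming the source file was saved with Windows (CRLF) line endings, the verbatim string contains a `\r` between each URL and the position where `$` can match. The trailing-space input fails for a related reason, since `[^\s]*` stops at the space and `$` cannot match there. The sketch below illustrates this with a deliberately tiny stand-in pattern (`https?://\S+`, an assumption for illustration, not the full expression). The `AnchorDemo` class and `Count` helper are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Tiny stand-in for the long URL pattern (illustration only).
    private const string Core = @"https?://\S+";

    // Hypothetical helper: counts matches using the options from the question.
    public static int Count(string pattern, string text)
    {
        return Regex.Matches(text, pattern,
            RegexOptions.Multiline | RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        // One URL alone on its line, one URL mid-line followed by a space.
        string crlf = "bla\r\nhttps://example.com/a\r\nbla https://example.com/b \r\nend";
        string lf = crlf.Replace("\r\n", "\n");

        // Anchored, LF line endings: only the URL that fills a whole line matches.
        Console.WriteLine(Count("^" + Core + "$", lf));              // 1

        // Anchored, CRLF line endings: the \r blocks $, so nothing matches.
        Console.WriteLine(Count("^" + Core + "$", crlf));            // 0

        // Unanchored: both URLs are found regardless of line endings.
        Console.WriteLine(Count(Core, crlf));                        // 2

        // Anchored, but tolerating an optional \r before the line end.
        Console.WriteLine(Count("^" + Core + @"(?=\r?$)", crlf));    // 1
    }
}
```

Dropping the anchors, as this answer does, is the simpler choice when URLs may appear anywhere in a line. The `(?=\r?$)` variant is only useful if a URL must occupy an entire line.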