Question

I am trying to detect all URLs listed in a block of free text. I am using .NET's Regex.Matches call with the following regex: (http|https)://[^\s "']{4,}

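Now, I have put in the following text:
    Here is a link http://somelink.com
    Here is a link with no space before ithttp://nospacelink.com/something?something=&39358235
    http://nospacelink.com/something?something=&12233454
    Here is a link I have already handled.
    Here is a secret question you are not allowed to know about https://somethingbad.com
    Just to be annoying I have put a new address in 'http://somethinginspeechmarks.com' http://postTextLink.com' What are you going to do now?
    Here is a link http://alinkwithafullstoplink.com, and then some post text
    Here is a link with a full stop http://alinkwithafullstoplink.com. And some more.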

I get the following output:

http://somelink.com
http://nospacelink.com?something=&39358235
http://nospacelink.com?something=&12233454
http://alreadyhandledlink.com
https://somethingbad.com
http://somethinginspeechmarks.com
http://postTextLink.com
http://alinkwithafullstoplink.com.

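Note the full stop on the last entry. How can I update my regex to say "if there is a full stop at the very end, ignore it"?

Also, please note that "Getting parts of a URL (Regex)" is not relevant to my problem, because that question is about how to break up one specific URL. I want to extract multiple complete URLs. Please look at my input and current output for clarification! I already have a regex that does most of what I want, but it is not quite right. Could you explain where my approach could be improved?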

Answer 1

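I would add something like [^\.] to the pattern.

This says that the last character cannot be a full stop.

So, with (http|https)://[^\s "']{4,}[^\.], it will try to match all addresses that do not end with a full stop.

Edit:

As noted in the comments, this one should be better as the final character class: [^.\s"']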
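To see why the edited class is the safer choice, here is a minimal sketch comparing the three patterns. It is written in Python rather than .NET purely for illustration; the patterns only use constructs that both regex engines support. The sample text is a shortened stand-in for the question's input and the helper name find_urls is hypothetical.

```python
import re

# Shortened sample, loosely based on the question's input (an assumption).
SAMPLE = (
    "Here is a link http://somelink.com\n"
    "In quotes 'http://somethinginspeechmarks.com'\n"
    "Here is a link with a full stop http://alinkwithafullstoplink.com. And some more."
)

# The asker's pattern.
ORIGINAL = r"""(http|https)://[^\s "']{4,}"""
# First suggestion: the last character is anything except a full stop.
FIRST_TRY = r"""(http|https)://[^\s "']{4,}[^\.]"""
# Edited suggestion: the last character is a URL character other than a full stop.
EDITED = r"""(http|https)://[^\s "']{4,}[^.\s"']"""


def find_urls(pattern, text):
    """Return the full text of every match (group 0), in order."""
    return [m.group(0) for m in re.finditer(pattern, text)]


for name, pattern in [("original", ORIGINAL), ("first try", FIRST_TRY), ("edited", EDITED)]:
    # repr() makes trailing spaces, quotes and newlines visible.
    print(name, [repr(u) for u in find_urls(pattern, SAMPLE)])
```

With [^\.] alone, the final character may be a space, quote or newline, so the match can swallow the delimiter after the URL. The edited class excludes those as well, which forces the engine to back off to the last non-period URL character.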

Answer 2

Update

Consider the following minor change to your pattern...

(http|https)://[^\s "']{4,}(?=\.)
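A quick way to check what the lookahead does is to run it on two of the question's sentences. The sketch below uses Python for illustration (the lookahead syntax is the same in .NET). The second pattern, with a negative lookbehind, is not part of this answer; it is a hypothetical variant added here only for comparison, and the helper name match_all is likewise an assumption.

```python
import re

# Pattern from this answer: the match must be followed by a full stop.
LOOKAHEAD = r"""(http|https)://[^\s "']{4,}(?=\.)"""
# Hypothetical variant (an assumption, not from the answer):
# the match must not END with a full stop.
LOOKBEHIND = r"""(http|https)://[^\s "']{4,}(?<!\.)"""

WITH_STOP = "Here is a link with a full stop http://alinkwithafullstoplink.com. And some more."
WITHOUT_STOP = "Here is a link http://somelink.com"


def match_all(pattern, text):
    """Return the full text of every match (group 0), in order."""
    return [m.group(0) for m in re.finditer(pattern, text)]


for text in (WITH_STOP, WITHOUT_STOP):
    print("lookahead: ", match_all(LOOKAHEAD, text))
    print("lookbehind:", match_all(LOOKBEHIND, text))
```

Because (?=\.) requires a period right after the match, a URL with no trailing full stop is cut back to the last internal dot ("http://somelink" in the second sentence). The lookbehind variant only rejects a match that ends in a period, so it trims the trailing full stop when there is one and leaves other URLs whole.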

Good luck!

Extracting all URLs in a free text block using RegEx
