Question

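I currently have a regular expression that extracts all links from a txt file.

What I need to add is a restriction so that it extracts only the URLs on one specific domain.

Can anyone quickly point out what I need to change in this regex to achieve that?

Thanks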

$regex = '/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i';

There is also this regex that I could use, but it would need the same addition.

#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#

Thanks for your help!

Answer 1

preg_match('%https?://(?:www\.)?twitter\.com[^\s]*%i', $subject, $regs);

Regex explanation:

https?://(?:www\.)?twitter\.com[^\s]*

Options: Case insensitive

Match the character string “http” literally (case insensitive) «http»
Match the character “s” literally (case insensitive) «s?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the character string “://” literally «://»
Match the regular expression below «(?:www\.)?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   Match the character string “www” literally (case insensitive) «www»
   Match the character “.” literally «\.»
Match the character string “twitter” literally (case insensitive) «twitter»
Match the character “.” literally «\.»
Match the character string “com” literally (case insensitive) «com»
Match a single character that is NOT a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) «[^\s]*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
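
Since `preg_match` stops at the first hit, collecting every matching link in the text needs `preg_match_all`. The following is a minimal sketch of that, not part of the original answer; the sample text and the variable names `$text`, `$pattern` and `$matches` are made up for illustration.

```php
<?php
// Hypothetical sample input; in practice this could be the contents of the
// txt file, e.g. loaded with file_get_contents().
$text = 'a http://www.google.com/sdfsdf.php b https://twitter.com/mylink c http://www.twitter.com/other d';

// Same pattern as in the answer: optional "www.", then the literal domain,
// then everything up to the next whitespace character.
$pattern = '%https?://(?:www\.)?twitter\.com[^\s]*%i';

// preg_match_all() collects every match; $matches[0] holds the full matches.
preg_match_all($pattern, $text, $matches);

print_r($matches[0]);
```

Because `[^\s]*` runs up to the next whitespace, punctuation that directly follows a link in running text (a trailing period or comma, for example) ends up inside the match.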


Answer 2

Would something like this work?

\b((https?|ftp|file):\/\/twitter\.com[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$])

I just ran it on the following text:

this is my test string http://www.google.com/sdfsdf.php https://twitter.com/mylink
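
A minimal sketch of that test run in PHP, with the same restriction also applied to the second pattern from the question. The delimiters, the `i` flag and the variable names are assumptions added here; the answer gives only the bare pattern.

```php
<?php
// The test string from the answer.
$text = 'this is my test string http://www.google.com/sdfsdf.php https://twitter.com/mylink';

// The answer's pattern, wrapped in "/" delimiters with the "i" flag
// (needed because the character classes list only upper-case letters).
$first = '/\b((https?|ftp|file):\/\/twitter\.com[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$])/i';

// The question's second pattern with the literal domain inserted after "://".
$second = '#\bhttps?://twitter\.com[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#';

preg_match_all($first, $text, $firstMatches);
preg_match_all($second, $text, $secondMatches);

// Index 0 holds the full matches; both patterns skip the google.com link.
print_r($firstMatches[0]);
print_r($secondMatches[0]);
```

Neither pattern accepts a `www.` prefix as written; inserting `(?:www\.)?` before the domain, as in Answer 1, would allow it. Both also need at least one further character after the domain, so a bare `https://twitter.com` is not matched.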

Adding domain-specific information to a regex
