Question

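I am trying to extract URLs from a string. I have different posts that contain URLs in their message text. I prepared a pattern to match them, but it is not working properly.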

Regex I tried

try {
    WebElement element = webDriver.findElementByXPath(btn);
    if(element != null && element.isDisplayed()){
        //do something 
    } else {
        //handle else
    }
} catch (TimeoutException e) {
    //handle else
} catch (NoSuchElementException e) {
    //handle else
}

CODE

try {
    WebElement element = webDriver.findElementByXPath(btn);
    if(element != null && element.isDisplayed()){
        //do something
    } else {
        //handle else
    }
} catch (Exception e) {
    //handle else
}

Example of my post

“This is just a post to test a regex for extracting URLs http://google.com, https://www.youtube.com/watch?v=dlw32af https://instagram.com/oscar/ en.wikipedia.org”

A post may contain commas, or it may contain multiple URLs with no comma between them.

Thanks, everyone :)

Answer 1

This should get you started:

\b(?:https?://)?(?:(?i:[a-z]+\.)+)[^\s,]+\b

Broken down, this says:

\b                   # a word boundary
(?:https?://)?       # http:// or https://, optional
(?:(?i:[a-z]+\.)+)   # any subdomain before
[^\s,]+              # neither whitespace nor comma
\b                   # another word boundary

See a demo on regex101.com.
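As a rough illustration of how this pattern behaves on the sample post, here is a minimal Python sketch. The choice of Python and the names `URL_PATTERN` and `extract_urls` are assumptions made for the example, since the thread does not say which language is being used. The pattern itself is copied unchanged from the answer.

```python
import re

# Pattern copied verbatim from the answer; the scoped (?i:...) group
# needs Python 3.6 or later.
URL_PATTERN = re.compile(r"\b(?:https?://)?(?:(?i:[a-z]+\.)+)[^\s,]+\b")


def extract_urls(text):
    """Return every substring of `text` that the pattern matches, in order."""
    # The pattern has only non-capturing groups, so findall returns
    # the full matches rather than group contents.
    return URL_PATTERN.findall(text)


sample = (
    "This is just a post to test a regex for extracting URLs "
    "http://google.com, https://www.youtube.com/watch?v=dlw32af "
    "https://instagram.com/oscar/ en.wikipedia.org"
)

for url in extract_urls(sample):
    print(url)
# http://google.com
# https://www.youtube.com/watch?v=dlw32af
# https://instagram.com/oscar
# en.wikipedia.org
```

Two things are worth noticing. The closing `\b` drops the trailing slash from the Instagram URL, because a slash followed by a space is not a word boundary. And since the scheme is optional, any dotted word such as `file.txt` also matches, so the results may need filtering if posts contain such text.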

Answer 2

First, I analyzed some Wikipedia URLs, as the attached screenshot clearly shows, and then wrote the regex!

https:\/\/en.wikipedia.org\/wiki\/(.*)
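To see what this pattern captures, here is a small Python sketch. The names `WIKI_PATTERN`, `TIGHTER_PATTERN`, and `wiki_title` are hypothetical, and the tighter variant is my own assumption rather than part of the answer. It escapes the dots and stops the capture at whitespace or a comma, which matters when the URL sits in the middle of a post.

```python
import re

# Pattern from the answer, unchanged. The greedy (.*) captures
# everything up to the end of the line.
WIKI_PATTERN = re.compile(r"https:\/\/en.wikipedia.org\/wiki\/(.*)")

# Assumed variant: literal dots, and a capture that stops at
# whitespace or a comma.
TIGHTER_PATTERN = re.compile(r"https://en\.wikipedia\.org/wiki/([^\s,]+)")


def wiki_title(text, pattern=WIKI_PATTERN):
    """Return the captured article part of the first match, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


post = "see https://en.wikipedia.org/wiki/Regular_expression for details"

print(wiki_title(post))                   # Regular_expression for details
print(wiki_title(post, TIGHTER_PATTERN))  # Regular_expression
```

Note that either pattern only finds English Wikipedia article links over https, so it does not cover the other URLs in the sample post.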

Regex to extract all URLs from a string
