Check whether a text contains more than one link

Time: 2015-09-11 20:08:57

Tags: java regex

I want to check whether a text contains more than one link, so I started with the following code:

private static void twoOrMorelinks(String commentstr) {
    String urlPattern = "^.*((?:http|https):\\/\\/\\S+){1,}.*((?:http|https):\\/\\/\\S+){1,}.*$";
    Pattern p = Pattern.compile(urlPattern, Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(commentstr);
    if (m.find()) {
        System.out.println("yes");
    }
}

But the code above isn't very elegant. I'm looking for something more like this:

private static void twoOrMorelinks(String commentstr) {
    String urlPattern = "^.*((?:http|https):\\/\\/\\S+){2,}.*$";
    Pattern p = Pattern.compile(urlPattern, Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(commentstr);
    if (m.find()) {
        System.out.println("yes");
    }
}

But this code doesn't work. For example, I expect it to report a match for the following text, and it doesn't:

They say 2's company watch live on...? http://www.le testin this code  http://www.lexilogos.com

Any ideas?

2 answers:

Answer 0: (score: 3)

Just use this to count how many links you have:

private static int countLinks(String str) {
    int total = 0;
    Pattern p = Pattern.compile("(?:http|https):\\/\\/");
    Matcher m = p.matcher(str);
    while (m.find()) {
        total++;
    }
    return total;
}

Then:

// ">= 2" means "two or more", so the variable is named accordingly
boolean hasTwoOrMore = countLinks("They say 2's company watch live on...? http://www.le testin this code  http://www.lexilogos.com") >= 2;

If you only need to know whether there are two or more, stop as soon as you have found two.
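
As a rough sketch of that early exit: the class name `LinkCounter`, the method `hasAtLeast` and its `threshold` parameter are made up for illustration, and the `CASE_INSENSITIVE` flag is borrowed from the question rather than from `countLinks` above. Like `countLinks`, it only looks for the scheme prefix and does not validate the rest of the URL.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LinkCounter {
    // Matches only the scheme prefix; compiled once and reused.
    private static final Pattern SCHEME =
            Pattern.compile("https?://", Pattern.CASE_INSENSITIVE);

    // Returns true as soon as `threshold` scheme prefixes have been seen.
    static boolean hasAtLeast(String text, int threshold) {
        if (threshold <= 0) {
            return true;
        }
        Matcher m = SCHEME.matcher(text);
        int found = 0;
        while (m.find()) {
            if (++found >= threshold) {
                return true; // stop scanning, the rest of the text is irrelevant
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String sample = "They say 2's company watch live on...? "
                + "http://www.le testin this code  http://www.lexilogos.com";
        System.out.println(hasAtLeast(sample, 2)); // prints "true"
    }
}
```

On a long comment with many links this stops at the second match instead of walking the whole string.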

Answer 1: (score: 2)

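I suggest using the find method rather than matches, which has to check the whole string. I have rewritten your pattern to limit the amount of backtracking: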

String urlPattern = "\\bhttps?://[^h]*+(?:(?:\\Bh|h(?!ttps?://))[^h]*)*+https?://";
Pattern p = Pattern.compile(urlPattern, Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(str);
if (m.find()) {
    // true
} else {
    // false
}

Pattern details:

\\b          # word boundary
https?://    # scheme for http or https
[^h]*+       # everything that is not an "h" (possessive, no backtracking)
(?:
    (?:
        \\Bh             # an "h" not preceded by a word boundary
      |                # OR
        h(?!ttps?://)    # an "h" not followed by "ttp://" or "ttps://"
    )
    [^h]*          
)*+
https?://   # another scheme
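
For what it's worth, Java can compile a pattern written in this annotated layout if you pass `Pattern.COMMENTS`, which makes the engine ignore whitespace and `#` comments inside the regex. A minimal sketch, assuming a made-up class `TwoLinksPattern` with a made-up helper `hasTwoLinks`; the regex itself is the one from this answer, unchanged:

```java
import java.util.regex.Pattern;

final class TwoLinksPattern {
    // Free-spacing version of the pattern: with Pattern.COMMENTS, whitespace
    // and "#..." comments inside the regex are ignored by the engine.
    private static final Pattern TWO_LINKS = Pattern.compile(
            "\\b https?://          # first scheme, at a word boundary\n"
          + "[^h]*+                 # everything that is not an 'h'\n"
          + "(?:\n"
          + "  (?: \\Bh             # an 'h' inside a word\n"
          + "    | h(?!ttps?://)    # or an 'h' that does not start a scheme\n"
          + "  )\n"
          + "  [^h]*\n"
          + ")*+\n"
          + "https?://              # second scheme\n",
            Pattern.CASE_INSENSITIVE | Pattern.COMMENTS);

    // find() is enough: the pattern succeeds once a second scheme is reached.
    static boolean hasTwoLinks(String text) {
        return TWO_LINKS.matcher(text).find();
    }
}
```

Calling `TwoLinksPattern.hasTwoLinks(...)` on the sample text from the question should return true, since the text contains two `http://` prefixes that each start at a word boundary.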