Regex to find all URLs, excluding png, jpg, gif

Time: 2016-06-02 07:52:25

Tags: java regex

I have the following code to extract all URLs (http only) from a page source (String text):
private synchronized void addLinks(String text) {
    String regex = "\\b(http)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
    Pattern urlPattern = Pattern.compile(regex);
    Matcher matcher = urlPattern.matcher(text);

    while (matcher.find()) {
        int matchStart = matcher.start(1);
        int matchEnd = matcher.end();
        String urlStr = text.substring(matchStart, matchEnd);

        // do something
    }
}

I need to add something to the regex so that it only matches URLs that link to text pages. Is that possible?

1 answer:

Answer 0: (score: 0)

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NewC {
    public static void main(String[] args) {
        // The negative lookbehind (?<!...) rejects URLs that end in .jpg, .png, or .gif;
        // $ requires the URL to run to the end of the input string.
        String URL_REGEX = "\\b((?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|])(?<!\\.(?:jpg|png|gif))$";

        Pattern p = Pattern.compile(URL_REGEX);
        Matcher m = p.matcher(args[0]); // replace with the string to compare
        if (m.find()) {
            System.out.println("String contains URL");
        }
    }
}
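
The answer above tests a single string, while the question scans a whole page source. One possible way to cover that case is a two-step approach: find every URL with the question's pattern, then discard the ones whose path ends in an image extension. The sketch below is only an illustration. The class name NonImageUrlFinder, the method findNonImageUrls, the https extension of the pattern, and the example.com URLs are all assumptions, not part of the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonImageUrlFinder {

    // URL pattern from the question, extended to also accept https (an assumption).
    private static final Pattern URL = Pattern.compile(
            "\\bhttps?://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]");

    // Matches a URL that ends in .png, .jpg, .jpeg, or .gif (case-insensitive),
    // optionally followed by a query string or fragment.
    private static final Pattern IMAGE = Pattern.compile(
            "(?i)\\.(?:png|jpe?g|gif)(?:[?#].*)?$");

    // Hypothetical helper: returns every URL in text that does not look like an image link.
    static List<String> findNonImageUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL.matcher(text);
        while (matcher.find()) {
            String url = matcher.group();
            if (!IMAGE.matcher(url).find()) {
                urls.add(url);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String page = "<a href=\"http://example.com/index.html\">home</a> "
                + "<img src=\"http://example.com/logo.png\"> "
                + "<img src=\"http://example.com/photo.JPG?size=2\">";
        System.out.println(findNonImageUrls(page));
    }
}
```

Keeping the extension check separate from the URL pattern avoids a pitfall of doing both in one unanchored regex: when a lookbehind rejects a match, the engine can backtrack and return a shortened URL such as one ending in ".jp". The check only looks at the extension, so a URL whose query string happens to end in ".png" would also be filtered out.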