I have the following code to extract all URLs (http only) from a page source (String text):
private synchronized void addLinks(String text) {
String regex = "\\b(http)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
Pattern urlPattern = Pattern.compile(regex);
Matcher matcher = urlPattern.matcher(text);
while(matcher.find()) {
String urlStr = matcher.group(); // the full text of the current match
// do something with urlStr
}
}
I need to add something to the regular expression so that it matches only URLs that link to text pages. Is that possible?
Answer 0: (score: 0)
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NewC {
    // Matches an http(s)/ftp/file URL that runs to the end of the input.
    // The negative lookbehind rejects URLs ending in .jpg, .png or .gif.
    private static final String URL_REGEX =
        "\\b((?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|](?<!\\.(?:jpg|png|gif)))$";

    public static void main(String[] args) {
        Pattern p = Pattern.compile(URL_REGEX);
        Matcher m = p.matcher(args[0]); // replace with the string to test
        if (m.find()) {
            System.out.println("String contains URL");
        }
    }
}
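For the loop in the question, which scans a whole page rather than a single string, one possible approach is to keep the http-only pattern as it is and filter each match by its file extension afterwards. The sketch below assumes that "text page" means "the URL path does not end in an image extension". The class name `TextLinkFilter`, the method `extractTextLinks`, and the extension list are all made up for illustration and are not from the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TextLinkFilter {
    // Same http-only pattern as in the question.
    private static final Pattern URL_PATTERN = Pattern.compile(
        "\\bhttp://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]");

    // Assumed list of extensions to skip; adjust to your needs.
    private static final String[] SKIPPED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"};

    // Returns every http URL found in text whose path does not end in a skipped extension.
    public static List<String> extractTextLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher matcher = URL_PATTERN.matcher(text);
        while (matcher.find()) {
            String url = matcher.group();
            if (!hasSkippedExtension(url)) {
                links.add(url);
            }
        }
        return links;
    }

    private static boolean hasSkippedExtension(String url) {
        // Ignore the query string and fragment when looking at the extension.
        int cut = url.length();
        int query = url.indexOf('?');
        int fragment = url.indexOf('#');
        if (query >= 0) cut = Math.min(cut, query);
        if (fragment >= 0) cut = Math.min(cut, fragment);
        String path = url.substring(0, cut).toLowerCase(Locale.ROOT);
        for (String ext : SKIPPED_EXTENSIONS) {
            if (path.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String page = "<a href=\"http://example.com/index.html\">x</a> "
                    + "<img src=\"http://example.com/logo.png\">";
        System.out.println(extractTextLinks(page)); // prints [http://example.com/index.html]
    }
}
```

Filtering after the match, instead of adding a lookbehind to an unanchored pattern, avoids a subtle problem: without the trailing `$`, the regex engine can backtrack and return a truncated match such as `http://example.com/logo.pn`. Note that a file extension is only a hint. To be sure a URL serves text you would have to check the `Content-Type` header of the response.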