How do I validate and fix an invalid number of slashes in a URL?

Asked: 2015-11-20 13:50:08

Tags: regex url java-8

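My code doesn't work as expected. The idea is to match a group against the slashes in the URL. There should be one or more slashes. The algorithm should replace any number of slashes with exactly two. How can I fix the code?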

HttpURLConverter

public class HttpURLConverter {

    final private String UrlPattern = "((([A-Za-z]{3,9}:(?:\\/\\/)?)(?:[\\-;:&=\\+\\$,\\w]+@)?[A-Za-z0-9\\.\\-]+|(?:www\\.|[\\-;:&=\\+\\$,\\w]+@)[A-Za-z0-9\\.\\-]+)((?:\\/[\\+~%\\/\\.\\w\\-_]*)?\\??(?:[\\-\\+=&;%@\\.\\w_]*)#?(?:[\\.\\!\\/\\\\\\w]*))?)";

    URL validateURL(URL url) throws MalformedURLException {
        URL validURL = null;
        if(!Pattern.matches(UrlPattern, url.toString())){
            if(Pattern.matches("(https?|ftp|file):.*", url.toString())){
                Matcher matcher = Pattern.compile("(https?|ftp|file):(\\/)*([A-za-z0-9\\.\\-?#_]+)([A-za-z0-9\\.\\-?#_\\/]{0,})", Pattern.CASE_INSENSITIVE).matcher(url.toString());

                List<String> allMatches = new ArrayList<String>();
                while (matcher.find()) {
                       allMatches.add(matcher.group());
                }
                if(allMatches.size() > 1){
                    System.out.println(allMatches.get(2));
                    allMatches.set(2, "//"); // replace any number of slashes with only two
                    validURL = new URL(allMatches.toString());

                }else{
                    throw new RuntimeException("Expected slashes after URL scheme definition but found none.");
                }
                System.out.println(matcher.group(1));
                System.out.println(matcher.group(2));
                System.out.println(matcher.group(3));
                System.out.println(matcher.group(4));

            }else{
                throw new RuntimeException("Given url is not valid. URL scheme is not detected");
            }
        }
        return validURL;
    }

}

TEST

@Test
public void testHttpURLConverter1() throws MalformedURLException {
    assertEquals("http://google.com", new HttpURLConverter().validateURL(new URL("http:///google.com")));
}

@Test
public void testHttpURLConverter2() throws MalformedURLException {
    assertEquals("http://google.com", new HttpURLConverter().validateURL(new URL("http:/google.com")));
}

2 Answers:

Answer 0 (score: 2)

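In addition to @Dishi Jain's solution: take a closer look at your test cases. You are comparing an object of type String with an object of type URL (the return type of validateURL). Even once the method is implemented correctly, your test cases will always fail, because a String object is never equal to a URL object.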

Do something like this:

@Test
public void testHttpURLConverter2() throws MalformedURLException {
    // Compare text with text: convert the returned URL to a String first
    assertEquals("http://google.com", new HttpURLConverter().validateURL(new URL("http:/google.com")).toString());
}

Or:

@Test
public void testHttpURLConverter2() throws MalformedURLException {
    // Compare URL with URL: wrap the expected value in a URL as well
    assertEquals(new URL("http://google.com"), new HttpURLConverter().validateURL(new URL("http:/google.com")));
}
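
To illustrate the type mismatch without a test framework, here is a minimal sketch. The class name UrlComparisonDemo and its two helper methods are hypothetical names introduced only for this example. It also compares textual forms instead of URL objects, because URL.equals may resolve host names, which is a blocking operation.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical demo class, not part of the question's code.
class UrlComparisonDemo {

    // String.equals(Object) returns false for any argument that is not a String,
    // so a String never equals a URL, whatever text the URL holds.
    static boolean stringEqualsUrl(String expected, URL actual) {
        return expected.equals(actual);
    }

    // Comparing the textual form sidesteps the type mismatch
    // and does not trigger any host name lookup.
    static boolean sameText(String expected, URL actual) {
        return expected.equals(actual.toExternalForm());
    }

    public static void main(String[] args) throws MalformedURLException {
        URL url = new URL("http://google.com");
        System.out.println(stringEqualsUrl("http://google.com", url)); // false
        System.out.println(sameText("http://google.com", url));        // true
    }
}
```

This is why the first variant above, which calls toString() on the result, is usually the safer assertion in a unit test.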

Answer 1 (score: 1)

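This is the best solution I could come up with. You will still need to add more checks and further processing to get a 100% reliable result. This method prints the validated URL for both of your test inputs.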

import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HttpURLConverter {

    final private String UrlPattern = "((([A-Za-z]{3,9}:(?:\\/\\/)?)(?:[\\-;:&=\\+\\$,\\w]+@)?[A-Za-z0-9\\.\\-]+|(?:www\\.|[\\-;:&=\\+\\$,\\w]+@)[A-Za-z0-9\\.\\-]+)((?:\\/[\\+~%\\/\\.\\w\\-_]*)?\\??(?:[\\-\\+=&;%@\\.\\w_]*)#?(?:[\\.\\!\\/\\\\\\w]*))?)";

    URL validateURL(URL url) throws MalformedURLException {
        System.out.println(url); // echo the input
        // A URL that already matches UrlPattern is returned unchanged instead of null
        URL validURL = url;
        if (!Pattern.matches(UrlPattern, url.toString())) {
            if (Pattern.matches("(https?|ftp|file):.*", url.toString())) {
                // [A-Za-z] rather than [A-za-z]: the range A-z also covers [ \ ] ^ _ `
                Matcher matcher = Pattern
                        .compile("(https?|ftp|file):(\\/)*([A-Za-z0-9\\.\\-?#_]+)([A-Za-z0-9\\.\\-?#_\\/]{0,})", Pattern.CASE_INSENSITIVE)
                        .matcher(url.toString());

                List<String> allMatches = new ArrayList<String>();
                while (matcher.find()) {
                    allMatches.add(matcher.group());
                }

                for (String str : allMatches) {
                    // Collapse only the slashes right after the scheme;
                    // replaceAll("(\\/)+", "//") would also double every slash in the path
                    str = str.replaceFirst(":/+", "://");
                    validURL = new URL(str);
                    System.out.println("Validated URL : " + validURL);
                }

            } else {
                throw new RuntimeException("Given url is not valid. URL scheme is not detected");
            }
        }

        return validURL;
    }

    public static void main(String[] args) throws MalformedURLException {
        new HttpURLConverter().validateURL(new URL("http:////google.com"));
    }

}

You get the following output:

http:////google.com
Validated URL : http://google.com
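
If the input is available as a plain String, the same idea can be expressed with a single anchored regex. The following is only a sketch under stated assumptions: the class SlashNormalizer and its normalize method are hypothetical names, not part of the code above. It accepts only http, https and ftp, and it requires at least one slash after the scheme, as the question asks. The file scheme is deliberately left out, because file URLs legitimately use three slashes (for example file:///tmp/x), and collapsing them would turn the first path segment into a host.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collapses the slash run after the scheme to exactly "//".
class SlashNormalizer {

    // group(1) = scheme, then ':', then one or more slashes, group(2) = the rest
    private static final Pattern SCHEME_SLASHES =
            Pattern.compile("^(https?|ftp):/+(.*)$", Pattern.CASE_INSENSITIVE);

    static String normalize(String raw) {
        Matcher m = SCHEME_SLASHES.matcher(raw);
        if (!m.matches()) {
            // Unknown scheme, or no slash at all after the colon
            throw new IllegalArgumentException("Expected <scheme>:/ but got: " + raw);
        }
        // Slashes inside the rest of the URL (the path) are left untouched
        return m.group(1) + "://" + m.group(2);
    }

    public static void main(String[] args) {
        System.out.println(normalize("http:/google.com"));        // http://google.com
        System.out.println(normalize("http:////google.com/a/b")); // http://google.com/a/b
    }
}
```

Working on the String before constructing a URL also avoids depending on how java.net.URL itself parses and re-serializes unusual slash runs.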