How do I check that strings in a text file are valid for a required format?

Time: 2016-03-14 03:09:29

Tags: java regex

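Hi, I have a text file that I'm reading input from, and the input is supposed to be in URL format (this format is just an example). The first part of the URL format is the scheme. The scheme consists of a string of one or more letters followed by the string "://", so "http://" is a valid scheme. Then there is a path, which is a string of one or more letters, digits, periods ('.'), and forward slashes. So a valid URL consists of a scheme immediately followed by a path.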

This is valid: http://example.com/hello/world.html

This is valid: this123://is.a/valid.url/456
This would be invalid: no-scheme-url.com/index.htm

The end goal is to tell the user whether each URL they put in the text file has a valid format. Here is what I have so far. Please help, thanks!!

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

public class URL {
    public static void main(String[] args) throws FileNotFoundException {
        Scanner console = new Scanner(System.in);
        System.out.println("Name of file: ");
        String inputFile = console.next();
        File file = new File(inputFile);
        Scanner in = new Scanner(file);
        ArrayList<String> list = new ArrayList<>();
        while (in.hasNext()) {
            list.add(in.nextLine());
            if (list.contains("://")) {
                System.out.print("valid");
            } else {
                System.out.print("invalid");
            }
        }
    }
}

2 answers:

Answer 0 (score: 0)

You can validate a URI with the following regular expression:

// Backslashes are doubled (\\d, \\?) because this is a Java string literal.
String validationRegex = "^([a-z0-9+.-]+):(?://(?:((?:[a-z0-9-._~!$&'()*+,;=:]|%[0-9A-F]{2})*)@)?((?:[a-z0-9-._~!$&'()*+,;=]|%[0-9A-F]{2})*)(?::(\\d*))?(/(?:[a-z0-9-._~!$&'()*+,;=:@/]|%[0-9A-F]{2})*)?|(/?(?:[a-z0-9-._~!$&'()*+,;=:@]|%[0-9A-F]{2})+(?:[a-z0-9-._~!$&'()*+,;=:@/]|%[0-9A-F]{2})*)?)(?:\\?((?:[a-z0-9-._~!$&'()*+,;=:/?@]|%[0-9A-F]{2})*))?(?:#((?:[a-z0-9-._~!$&'()*+,;=:/?@]|%[0-9A-F]{2})*))?$";

Pattern p = Pattern.compile(validationRegex);
Matcher m = p.matcher(urlAddress);
boolean isValid = m.matches();

An annotated variant of the pattern, with its numbered capture groups, breaks down as follows:

^
([a-z][a-z0-9+.-]*):                                    #1 scheme
(?:
    \/\/                                        it has an authority:

    (                                       #2 authority
        (?:(?=((?:[a-z0-9-._~!$&'()*+,;=:]|%[0-9A-F]{2})*))(\3)@)?      #4 userinfo
        (?=(\[[0-9A-F:.]{2,}\]|(?:[a-z0-9-._~!$&'()*+,;=]|%[0-9A-F]{2})*))\5    #5 host (loose check to allow for IPv6 addresses)
        (?::(?=(\d*))\6)?                           #6 port
    )

    (\/(?=((?:[a-z0-9-._~!$&'()*+,;=:@\/]|%[0-9A-F]{2})*))\8)?          #7 path

    |                                       it doesn't have an authority:

    (\/?(?!\/)(?=((?:[a-z0-9-._~!$&'()*+,;=:@\/]|%[0-9A-F]{2})*))\10)?      #9 path
)
(?:
    \?(?=((?:[a-z0-9-._~!$&'()*+,;=:@\/?]|%[0-9A-F]{2})*))\11           #11 query string
)?
(?:
    #(?=((?:[a-z0-9-._~!$&'()*+,;=:@\/?]|%[0-9A-F]{2})*))\12            #12 fragment
)?
$
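
For pulling out the components that the breakdown above labels (scheme, host, path), the standard library's java.net.URI can do the parsing in place of a hand-written pattern. The following is a minimal sketch and not part of the original answer: the class name UriParts and the helper describe are made up for illustration, and java.net.URI's grammar is not guaranteed to agree with the regex above on every input.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriParts {
    // Hypothetical helper: returns "scheme|host|path", or null when the text
    // cannot be parsed as a URI or carries no scheme.
    static String describe(String text) {
        try {
            URI uri = new URI(text);
            if (uri.getScheme() == null) {
                // e.g. "no-scheme-url.com/index.htm" parses as a relative reference
                return null;
            }
            return uri.getScheme() + "|" + uri.getHost() + "|" + uri.getPath();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("http://example.com/hello/world.html"));
        System.out.println(describe("no-scheme-url.com/index.htm"));
    }
}
```

Treating a missing scheme as invalid is a choice made here to match the question's rule that a URL must start with a scheme; java.net.URI itself accepts relative references.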

Answer 1 (score: 0)

You can use Java's URL constructor as a validator:

boolean isValidUrl(String url) {
    try {
        new URL(url);
        return true;
    } catch (MalformedURLException e) {
        return false;
    }
}
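
One caveat with the constructor approach: new URL(...) throws MalformedURLException for any scheme that has no registered protocol handler, so the question's this123://is.a/valid.url/456 would be reported as invalid. If the simplified format from the question is what actually needs checking, a small pattern may fit better. This is only a sketch under assumptions: the names SimpleUrlFormat and isValidFormat are invented, and the scheme is allowed to contain digits as well as letters because the question's second valid example does.

```java
import java.util.regex.Pattern;

class SimpleUrlFormat {
    // Scheme: one or more letters or digits (digits are an assumption, see above),
    // then the literal "://", then a path made of letters, digits, '.' and '/'.
    private static final Pattern FORMAT =
            Pattern.compile("[A-Za-z0-9]+://[A-Za-z0-9./]+");

    static boolean isValidFormat(String line) {
        // matches() succeeds only if the whole line fits the pattern.
        return FORMAT.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidFormat("http://example.com/hello/world.html")); // true
        System.out.println(isValidFormat("this123://is.a/valid.url/456"));        // true
        System.out.println(isValidFormat("no-scheme-url.com/index.htm"));         // false
    }
}
```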

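Edit: your code currently doesn't work because you are calling List.contains() rather than String.contains(). list.contains("://") is true only when some element of the list is exactly "://"; it does not look inside each line. Update the loop to: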

// hasNextLine() pairs with nextLine(); println keeps one result per line
while (in.hasNextLine()) {
    String line = in.nextLine();
    list.add(line);
    if (line.contains("://")) {
        System.out.println("valid");
    } else {
        System.out.println("invalid");
    }
}

Or, if you want to use my method, replace

if (line.contains("://"))

with

if (isValidUrl(line))
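
To make swapping the check easier, the loop can be separated from the test it applies. Below is a minimal sketch, not the answerer's code: UrlFileCheck and report are hypothetical names, the check is passed in as a Predicate so that either line.contains("://") or isValidUrl can be plugged in, and skipping blank lines is an assumption about the input file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.function.Predicate;

class UrlFileCheck {
    // Builds one "<line>: valid" or "<line>: invalid" entry per non-blank line.
    static List<String> report(List<String> lines, Predicate<String> check) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: blank lines are not URLs to be judged
            }
            out.add(trimmed + ": " + (check.test(trimmed) ? "valid" : "invalid"));
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Scanner console = new Scanner(System.in);
        System.out.println("Name of file: ");
        String inputFile = console.next();
        List<String> lines = Files.readAllLines(Paths.get(inputFile));
        // Swap the lambda for a stricter check if needed.
        report(lines, line -> line.contains("://")).forEach(System.out::println);
    }
}
```

Printing the line next to its verdict lets the user see which entry in the file failed, which a bare "valid" or "invalid" does not.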