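I am parsing some links from a web page and then testing whether those links exist. I convert each parsed link string to a URI. The problem is that some links already contain percent-encoded characters, like this one: http://download.microsoft.com/download/6/3/c/63c1d527-9d7e-4fd6-9867-fd0632066740/kinect_qsg%20premium_bndl_en-fr-es.pdf
When I pass it through the code below, I get: http://download.microsoft.com/download/6/3/c/63c1d527-9d7e-4fd6-9867-fd0632066740/kinect_qsg%2520premium_bndl_en-fr-es.pdf
You can see that the existing %20 has been encoded a second time, as %2520. How can I avoid this? Should I decode my string first? If so, what is the best way to do it?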
    URL url = null;
    URI uri = null;
    try {
        url = new URL(checkUrl);
    } catch (MalformedURLException e1) {
        e1.printStackTrace();
    }
    try {
        uri = new URI(url.getProtocol(), url.getAuthority(), url.getPath(), url.getQuery(), url.getRef());
    } catch (URISyntaxException e1) {
        e1.printStackTrace();
    }
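A note on why this happens: the multi-argument java.net.URI constructors quote illegal characters in each component and always quote the percent sign, so an existing %20 becomes %2520. Parsing the complete string with the single-argument form leaves existing escapes alone. The following is a minimal sketch of that difference; the class name, method names, and the shortened example.com URL are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class DoubleEncodingDemo {

    // Builds the URI from separate components. The multi-argument
    // constructor always quotes '%', so "%20" in the path turns into "%2520".
    static String viaComponents(String host, String path) {
        try {
            return new URI("http", host, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Parses the whole string at once. Existing escapes such as "%20"
    // are kept exactly as they appear in the input.
    static String viaSingleString(String encoded) {
        return URI.create(encoded).toString();
    }

    public static void main(String[] args) {
        // prints http://example.com/a%2520b.pdf
        System.out.println(viaComponents("example.com", "/a%20b.pdf"));
        // prints http://example.com/a%20b.pdf
        System.out.println(viaSingleString("http://example.com/a%20b.pdf"));
    }
}
```

If every link is already correctly encoded, the single-string form may be all that is needed. It throws on strings that contain raw illegal characters such as spaces, so it does not help with links that are not encoded yet.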
Answer 0 (score: 2)
Try using the URLDecoder class:
    URL url = null;
    URI uri = null;
    String checkUrl = "http://download.microsoft.com/download/6/3/c/63c1d527-9d7e-4fd6-9867-fd0632066740/kinect_qsg%20premium_bndl_en-fr-es.pdf";
    try {
        url = new URL(URLDecoder.decode(checkUrl, "UTF-8"));
    } catch (MalformedURLException e1) {
        e1.printStackTrace();
    } catch (UnsupportedEncodingException e1) {
        e1.printStackTrace();
    }
    try {
        uri = new URI(url.getProtocol(), url.getAuthority(), url.getPath(), url.getQuery(), url.getRef());
        System.out.println(uri.getHost());
    } catch (URISyntaxException e1) {
        e1.printStackTrace();
    }
The fully qualified name of the class is java.net.URLDecoder.
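The same decode-then-rebuild idea can be wrapped in a single method, which also avoids continuing with a null url after a failed parse. This is only a sketch: the class name DecodeThenRebuild, the method name normalize, and the shortened example.com links are hypothetical, and wrapping the checked exceptions in IllegalArgumentException is a choice made for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLDecoder;

class DecodeThenRebuild {

    // Decodes any existing escapes first, then lets the multi-argument
    // URI constructor encode the components exactly once.
    static URI normalize(String link) {
        try {
            URL url = new URL(URLDecoder.decode(link, "UTF-8"));
            return new URI(url.getProtocol(), url.getAuthority(),
                    url.getPath(), url.getQuery(), url.getRef());
        } catch (MalformedURLException | UnsupportedEncodingException | URISyntaxException e) {
            // Fail fast instead of carrying a null URL into the next step.
            throw new IllegalArgumentException("Not a usable link: " + link, e);
        }
    }

    public static void main(String[] args) {
        // Both lines print http://example.com/a%20b.pdf
        System.out.println(normalize("http://example.com/a%20b.pdf"));
        System.out.println(normalize("http://example.com/a b.pdf"));
    }
}
```

With this shape, links that are already encoded and links that contain raw spaces end up in the same form, which seems to be what the question is after.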
Answer 1 (score: 1)
You can use:
    String decoded = URLDecoder.decode(yourUrl, "UTF-8");
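One caveat worth checking before relying on this: URLDecoder implements HTML form decoding, so it also turns a literal + into a space. If the links may contain + in a path, a common workaround is to escape it as %2B before decoding. The sketch below shows both behaviours; the class and method names are invented for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PlusSafeDecode {

    // Plain form decoding: "%20" and "+" both become a space.
    static String decodeForm(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Escapes literal '+' as "%2B" first so that it survives decoding.
    static String decodeKeepingPlus(String s) {
        return decodeForm(s.replace("+", "%2B"));
    }

    public static void main(String[] args) {
        System.out.println(decodeForm("a+b%20c"));        // prints "a b c"
        System.out.println(decodeKeepingPlus("a+b%20c")); // prints "a+b c"
    }
}
```

Whether + should be kept depends on where it appears: in a query string produced by a form it usually does mean a space, while in a path it is usually a literal plus sign.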