String url = String.format("http://%s.jpg.to", URLEncoder.encode("свинья", "utf-8"));
new URL(url).openStream();
Document doc = Jsoup.connect(url).get();
I want to read a web page whose URL contains Russian characters, but I get an exception (Android 4.1.1):
W/System.err: java.net.UnknownHostException: http://%D1%81%D0%B2%D0%B8%D0%BD%D1%8C%D1%8F.jpg.to
W/System.err: at libcore.net.http.HttpConnection$Address.<init>(HttpConnection.java:283)
W/System.err: at libcore.net.http.HttpConnection.connect(HttpConnection.java:128)
W/System.err: at libcore.net.http.HttpEngine.openSocketConnection(HttpEngine.java:315)
W/System.err: at libcore.net.http.HttpEngine.connect(HttpEngine.java:310)
W/System.err: at libcore.net.http.HttpEngine.sendSocketRequest(HttpEngine.java:289)
W/System.err: at libcore.net.http.HttpEngine.sendRequest(HttpEngine.java:239)
W/System.err: at libcore.net.http.HttpURLConnectionImpl.connect(HttpURLConnectionImpl.java:80)
W/System.err: at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:563)
W/System.err: at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:540)
W/System.err: at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:227)
W/System.err: at org.jsoup.helper.HttpConnection.get(HttpConnection.java:216)
W/System.err: at test.jpgto.MainActivity$RetrieveImageTask.doInBackground(MainActivity.java:63)
W/System.err: at test.jpgto.MainActivity$RetrieveImageTask.doInBackground(MainActivity.java:49)
W/System.err: at android.os.AsyncTask$2.call(AsyncTask.java:287)
W/System.err: at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:305)
W/System.err: at java.util.concurrent.FutureTask.run(FutureTask.java:137)
W/System.err: at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:230)
W/System.err: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1076)
W/System.err: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:569)
W/System.err: at java.lang.Thread.run(Thread.java:856)
A link such as http://2.jpg.to/, however, works fine. What am I doing wrong?
Answer 0 (score: 1)
What happens when you put the characters into the URL as they are? For example, try something like this:
String host = "свинья";
// Format the URL first, then pass it to convertUrlToPunycodeIfNeeded, which uses java.net.IDN
String url = convertUrlToPunycodeIfNeeded(String.format("http://%s.jpg.to", host));
// Then simply use the URL
new URL(url).openStream();
Document doc = Jsoup.connect(url).get();
Here is code showing how to use java.net.IDN in your case:
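// Requires: import java.net.IDN; import java.nio.charset.Charset;
// Converts a non-ASCII host name to its Punycode (ASCII-compatible) form.
// URLs that are already pure ASCII are returned unchanged.
// Everything after the scheme is passed to IDN.toASCII, so this is intended
// for URLs whose non-ASCII characters are in the host name.
public static String convertUrlToPunycodeIfNeeded(String url) {
    if (!Charset.forName("US-ASCII").newEncoder().canEncode(url)) {
        if (url.toLowerCase().startsWith("http://")) {
            url = "http://" + IDN.toASCII(url.substring(7));
        } else if (url.toLowerCase().startsWith("https://")) {
            url = "https://" + IDN.toASCII(url.substring(8));
        } else {
            url = IDN.toASCII(url);
        }
    }
    return url;
}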
I found this good example here, under Example 1.
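If the URL may also carry a path or a port, one possible variation is to convert only the host part. The following is a minimal sketch, not part of the linked example: the class name IdnUrlSketch and the helper toAsciiHost are hypothetical, and it assumes the URL has no user-info section and no IPv6 literal.

```java
import java.net.IDN;

public class IdnUrlSketch {

    // Hypothetical helper: converts only the host of a URL to Punycode and
    // leaves the scheme, port, path, query and fragment untouched.
    // Assumes there is no "user:password@" part and no IPv6 literal host.
    static String toAsciiHost(String url) {
        int schemeEnd = url.indexOf("://");
        int hostStart = schemeEnd < 0 ? 0 : schemeEnd + 3;

        // The host ends at the first ':', '/', '?' or '#' after the scheme.
        int hostEnd = url.length();
        for (int i = hostStart; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c == ':' || c == '/' || c == '?' || c == '#') {
                hostEnd = i;
                break;
            }
        }

        String host = url.substring(hostStart, hostEnd);
        return url.substring(0, hostStart) + IDN.toASCII(host) + url.substring(hostEnd);
    }

    public static void main(String[] args) {
        // The host is converted to an "xn--" label; the path stays as written.
        System.out.println(toAsciiHost("http://свинья.jpg.to/index.html"));
        // A pure-ASCII URL comes back unchanged.
        System.out.println(toAsciiHost("http://2.jpg.to/"));
    }
}
```

Any non-ASCII characters in the path or query would still need percent-encoding separately; this sketch deliberately does not touch them.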
Answer 1 (score: 0)
String url = String.format("http://%s.jpg.to", IDN.toASCII("свинья"));
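The reason this works where URLEncoder does not: percent-encoding is meant for the path and query of a URL, whereas host names are resolved through DNS, which expects the Punycode ("xn--") form. With URLEncoder the resolver is asked for the literal host shown in the stack trace. The sketch below puts the two side by side; the class and method names (HostEncodingSketch, percentEncoded, punycode) are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.IDN;
import java.net.URLEncoder;

public class HostEncodingSketch {

    // Percent-encoding: appropriate for path and query components, not for host names.
    static String percentEncoded(String label) {
        try {
            return URLEncoder.encode(label, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // IDNA conversion: turns a Unicode host label into its ASCII "xn--" form.
    static String punycode(String label) {
        return IDN.toASCII(label);
    }

    public static void main(String[] args) {
        String label = "свинья";
        // Prints http://%D1%81%D0%B2%D0%B8%D0%BD%D1%8C%D1%8F.jpg.to, the host that cannot be resolved.
        System.out.println(String.format("http://%s.jpg.to", percentEncoded(label)));
        // Prints a host beginning with "xn--", which DNS can look up.
        System.out.println(String.format("http://%s.jpg.to", punycode(label)));
    }
}
```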