What is wrong with this code:
Document doc = Jsoup.connect("www.dw.com/ur/مارشل-لاء-کا-مطالبہ-سازش-یا-خواہش؟/a-19395440?maca=urd-rss-urd-all-1497-xml-mrss").get();
When I try to open the connection, it opens www.dw.com, but I want to open www.dw.com/ur/مارشل-لاء-کا-مطالبہ-سازش-یا-خواہش؟/a-19395440?maca=urd-rss-urd-all-1497-xml-mrss.
I think this happens because the URL contains Urdu words. How can I solve this?

Answer 0 (score: 1)
Use HttpClient and URI-encode the non-ASCII part of the URL:
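// Requires Apache HttpClient 4.x, Apache Commons Lang (StringUtils) and jsoup on the classpath.
// Encode only the Urdu path segment; encoding the whole URL would also escape "http://" and the slashes.
String segment = URLEncoder.encode("مارشل-لاء-کا-مطالبہ-سازش-یا-خواہش؟", "UTF-8");
// URLEncoder produces form encoding, so map "+", "*" and "%7E" to their URI forms.
segment = StringUtils.replaceEach(segment, new String[]{"+", "*", "%7E"}, new String[]{"%20", "%2A", "~"});
String url = "http://www.dw.com/ur/" + segment + "/a-19395440?maca=urd-rss-urd-all-1497-xml-mrss";
HttpClient httpClient = HttpClientBuilder.create().build();
HttpGet httpget = new HttpGet(url);
HttpResponse response = httpClient.execute(httpget);
// BasicResponseHandler returns the response body as a String.
String res = new BasicResponseHandler().handleResponse(response);
Document doc = Jsoup.parse(res);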
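
As an alternative to encoding the path segment by hand, the JDK's java.net.URI can percent-encode the non-ASCII characters of a full URL while leaving the scheme, slashes and query delimiters alone. The following is only a minimal sketch under that assumption; the class name UrlEncodingSketch and the helper toAsciiUrl are hypothetical names chosen for illustration, and the snippet has not been checked against the dw.com server itself.

```java
import java.net.URI;

public class UrlEncodingSketch {
    // Hypothetical helper: percent-encodes non-ASCII characters as UTF-8.
    // ASCII parts such as "http://", "/" and "?" are left untouched.
    // URI.create throws IllegalArgumentException if the string is not a valid URI
    // (for example, if it contains a raw space).
    static String toAsciiUrl(String url) {
        return URI.create(url).toASCIIString();
    }

    public static void main(String[] args) {
        String url = "http://www.dw.com/ur/مارشل-لاء-کا-مطالبہ-سازش-یا-خواہش؟/a-19395440?maca=urd-rss-urd-all-1497-xml-mrss";
        // Prints an ASCII-only URL with the Urdu segment percent-encoded.
        System.out.println(toAsciiUrl(url));
    }
}
```

The resulting string should be usable with either HttpGet or Jsoup.connect(...).get(). Note that the URL in the question has no scheme, so "http://" would need to be prepended before connecting.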