jsoup does not connect to a URL containing Urdu words

Time: 2016-07-13 04:51:02

Tags: java android jsoup

What is wrong with this code?

Document doc = Jsoup.connect("www.dw.com/ur/مارشل-لاء-کا-مطالبہ-سازش-یا-خواہش؟/a-19395440?maca=urd-rss-urd-all-1497-xml-mrss").get();

When I try to open the connection, it opens www.dw.com, but I want it to open www.dw.com/ur/مارشل-لاء-کا-مطالبہ-سازش-یا-خواہش؟/a-19395440?maca=urd-rss-urd-all-1497-xml-mrss.

I think this happens because the URL contains Urdu words. How can I solve this?

1 answer:

Answer 0: (score: 1)

Use HttpClient and URI encoding:

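// Requires Apache HttpClient 4.3+, Apache Commons Lang (StringUtils), and jsoup; URLEncoder is java.net.URLEncoder.
// Encode only the non-ASCII path segment: encoding the whole URL would also escape "://" and "/".
String slug = "مارشل-لاء-کا-مطالبہ-سازش-یا-خواہش؟";
slug = StringUtils.replaceEach(URLEncoder.encode(slug, "UTF-8"), new String[]{"+", "*", "%7E"}, new String[]{"%20", "%2A", "~"});
String url = "http://www.dw.com/ur/" + slug + "/a-19395440?maca=urd-rss-urd-all-1497-xml-mrss";
HttpClient httpClient = HttpClientBuilder.create().build();
HttpGet httpget = new HttpGet(url);
HttpResponse response = httpClient.execute(httpget);
// BasicResponseHandler returns the response body as a String.
String res = new BasicResponseHandler().handleResponse(response);
Document doc = Jsoup.parse(res);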
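
As an alternative to encoding the path segment by hand, the standard java.net.URI class can do the percent-encoding. The sketch below is only an illustration under assumptions: the class name AsciiUrl and the helper toAsciiUrl are hypothetical names and not part of jsoup or HttpClient, and the URL is split into scheme, host, path and query by hand. It also includes the "http" scheme, which the URL in the question was missing.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class AsciiUrl {
    // Hypothetical helper (not part of jsoup): builds a URL from its parts and
    // percent-encodes any non-ASCII characters as UTF-8.
    static String toAsciiUrl(String scheme, String host, String path, String query)
            throws URISyntaxException {
        // The multi-argument constructor quotes characters that are illegal in each
        // component; toASCIIString() then escapes the remaining non-ASCII characters.
        return new URI(scheme, host, path, query, null).toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        String url = toAsciiUrl("http", "www.dw.com",
                "/ur/مارشل-لاء-کا-مطالبہ-سازش-یا-خواہش؟/a-19395440",
                "maca=urd-rss-urd-all-1497-xml-mrss");
        System.out.println(url);
    }
}
```

The resulting ASCII-only string could then be passed to Jsoup.connect(url).get(), the call used in the question. Whether the server answers that request as expected was not verified here.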