Why doesn't jsoup read input as UTF-8?

Time: 2014-06-14 08:51:14

Tags: java html utf-8 jsoup

I want jsoup to parse my input as UTF-8, but I can't get it to work. I have tried everything I know and searched Google.

What I am trying to do:

String tmp_html_content ="Öç";

InputStream is = new ByteArrayInputStream(tmp_html_content.getBytes());            
Document doc_tbl  =  Jsoup.parse(is, "UTF-8", ""); 
doc_tbl.outputSettings().charset().forName("UTF-8");
doc_tbl.outputSettings().escapeMode(EscapeMode.xhtml);

doc_tbl does not come out as UTF-8.

Please help me solve this problem.

1 Answer:

Answer 0: (score: 2)

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Entities.EscapeMode;

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");

        String tmp_html_content = "Öçasasa";

        // getBytes() with no argument encodes with the platform default charset,
        // so the charset name passed to Jsoup.parse must match that default.
        InputStream is = new ByteArrayInputStream(tmp_html_content.getBytes());
        try {
            org.jsoup.nodes.Document doc_tbl = Jsoup.parse(is, "ISO-8859-9", "");
            // Set the charset used when the document is serialized.
            doc_tbl.outputSettings().charset("UTF-8");
            doc_tbl.outputSettings().escapeMode(EscapeMode.xhtml);
            String htmlString = doc_tbl.toString();
            System.out.println(htmlString);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Output:

Hello World       Öçasasa
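
A likely reason the "ISO-8859-9" variant prints the text correctly is that getBytes() without an argument encodes with the platform default charset, which was presumably ISO-8859-9 (or something compatible) on the machine where it ran. The sketch below uses only the JDK, no jsoup, to illustrate that idea: bytes decode cleanly only when the decoder uses the same charset as the encoder. The class name CharsetMismatchDemo and the helper roundTrip are hypothetical names made up for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Encode the text with one charset, then decode the bytes with another.
    public static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // "Öç" written with Unicode escapes so the source file encoding does not matter.
        String original = "\u00D6\u00E7";
        Charset turkish = Charset.forName("ISO-8859-9");

        // Same charset on both sides: the text survives.
        System.out.println(original.equals(
                roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8))); // true
        System.out.println(original.equals(
                roundTrip(original, turkish, turkish)));                               // true

        // Encoded as ISO-8859-9 but decoded as UTF-8: the text is garbled.
        System.out.println(original.equals(
                roundTrip(original, turkish, StandardCharsets.UTF_8)));                // false
    }
}
```

Under that assumption, calling tmp_html_content.getBytes(StandardCharsets.UTF_8) and passing "UTF-8" to Jsoup.parse should keep both sides in agreement regardless of the platform default.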