Question

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import javax.net.ssl.HttpsURLConnection;
public class testa {
    public static void main(String[] args) throws IOException {
        String nextLine = "";
        URL url = null;
        URLConnection urlConn = null;       
        InputStreamReader  inStream = null;
        BufferedReader buff = null;
        try{
            url  = new URL("https://kickass.to");
            urlConn = url.openConnection();      
            ((HttpsURLConnection) urlConn).setHostnameVerifier(new Verifier());
            inStream = new InputStreamReader(urlConn.getInputStream());
            buff= new BufferedReader(inStream);
            // Assign and test in one step so the terminating null is never printed
            while((nextLine = buff.readLine()) != null){
                System.out.println(nextLine);
            }
        }catch(MalformedURLException e){
               System.out.println("Please check the URL:" +  e.toString() );
        } catch(IOException  e1){
            System.out.println("Can't read  from the Internet: "+ e1.toString() ); 
        }        
    }

 }

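Hey, I want to get the source code of this website. The code works when I use it on other sites, but when I use it on www.kickass.to the response comes back encoded or something, and looks like this: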

iÞŠpÃ2÷4rqy"pc‚Q‚ßÑÄ¶vnæö2”cnä.>*‰˜›m(Ïú¿p*s²™„J.û’›TÔÓµÄé¸˜aÈº3ÛTYÜè¾Eúm9ìbQ.n‚+ô"§€¾AêtY.¾ƒàj4Gœ9ðõaˆoPz–¡¹‹Ìo÷9íyh´4½ ÷ ¾ÏÀ|«M?E©Û”Þc\ñ°³%?øó"Y„&ÃƒixrN¾ç\-ÛÚ~>

Does anyone know how to get the source code of kickass.to?

Answer 1

If you check the response headers, you will notice that they contain

content-encoding:gzip

If you check the page source, you will notice that the charset is UTF-8.

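So the body is gzip-compressed, and reading it as plain text produces the garbage you see. You need to read the stream through a GZIPInputStream (from java.util.zip) and decode it as UTF-8 (StandardCharsets is in java.nio.charset):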

inStream = new InputStreamReader(new GZIPInputStream(urlConn.getInputStream()), StandardCharsets.UTF_8);
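Not every server compresses its responses, so wrapping unconditionally in GZIPInputStream would break on the sites that already worked for you. Below is a minimal sketch of one way to handle both cases: it only decompresses when the Content-Encoding value is gzip. The class name GzipAwareReader and the helpers unwrap and readAll are hypothetical names made up for this illustration, and main feeds in an in-memory gzip body instead of a real connection so it runs without network access.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipAwareReader {

    // Hypothetical helper: wrap the raw stream in a GZIPInputStream only when
    // the Content-Encoding header value says the body is gzip-compressed.
    public static InputStream unwrap(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(raw);
        }
        return raw;
    }

    // Hypothetical helper: read the whole body as UTF-8 text, one line at a time.
    // UTF-8 is an assumption taken from the page's declared charset.
    public static String readAll(InputStream raw, String contentEncoding) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(unwrap(raw, contentEncoding), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulate a gzip-compressed response body in memory.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        }
        String body = readAll(new ByteArrayInputStream(compressed.toByteArray()), "gzip");
        System.out.print(body);
    }
}
```

With a real connection you would call readAll(urlConn.getInputStream(), urlConn.getContentEncoding()); getContentEncoding() returns the value of the Content-Encoding header, or null if the server did not send one.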
