Why am I not getting the complete page source?

Time: 2017-02-15 10:43:19

Tags: java

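import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

private static String[] getUrlSource2(String site) throws IOException {
    List<String> myList = new ArrayList<String>();
    URL url = new URL(site);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection(); // cast is safe for http/https URLs
    // Per-connection setting; the static setFollowRedirects changes the default for all connections
    conn.setInstanceFollowRedirects(true);
    // Advertise only gzip, since that is the only encoding decoded below
    conn.setRequestProperty("Accept-Encoding", "gzip");
    String encoding = conn.getContentEncoding();
    InputStream inStr;

    if (encoding != null && encoding.equalsIgnoreCase("gzip")) {
        inStr = new GZIPInputStream(conn.getInputStream());
    } else {
        inStr = conn.getInputStream();
    }

    // try-with-resources closes the reader even if readLine() throws
    try (BufferedReader in = new BufferedReader(new InputStreamReader(inStr, "UTF-8"))) {
        String inputLine;
        while ((inputLine = in.readLine()) != null) {
            myList.add(inputLine);
        }
    }

    return myList.toArray(new String[myList.size()]);
}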

This is my getSource method. For some reason it only gives me part of the source of the page at the URL, and I can't figure out why. I'd be deeply grateful for any help.

For example, if you run:

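public class Main {

    // getUrlSource2 (shown above) is assumed to be declared in this class

    public static void main(String[] args) {
        try {
            String[] A = getUrlSource2("https://www.google.pt/");

            // Print each line prefixed with its index
            for (int i = 0; i < A.length; i++) {
                System.out.print(String.valueOf(i) + "        ");
                System.out.println(A[i]);
            }
        } catch (IOException e) {
            e.printStackTrace(); // report the failure instead of silently ignoring it
        }
    }
}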

you get 5 lines of source when you should get roughly 300 to 400.
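
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: a page served as minified HTML can contain very few line breaks, so a small number of lines from readLine() does not by itself show that the body was cut short. The sketch below reads the whole response into one String and reports both the line count and the character count, which makes it easier to tell "few long lines" apart from "missing content". The names SourceCheck, readBody, countLines and fetch are hypothetical and only serve this illustration; the connection setup mirrors the getUrlSource2 method above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public class SourceCheck {

    // Reads the whole stream into a String, gunzipping first when the
    // Content-Encoding value passed in is "gzip" (null means no encoding).
    static String readBody(InputStream raw, String contentEncoding) throws IOException {
        try (InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw) : raw) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Counts lines the same way BufferedReader.readLine() splits them.
    static long countLines(String body) {
        return new BufferedReader(new StringReader(body)).lines().count();
    }

    // Fetches a page with the same request setup as getUrlSource2.
    static String fetch(String site) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(site).openConnection();
        conn.setInstanceFollowRedirects(true);
        conn.setRequestProperty("Accept-Encoding", "gzip");
        return readBody(conn.getInputStream(), conn.getContentEncoding());
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java SourceCheck <url>");
            return;
        }
        String body = fetch(args[0]);
        System.out.println(countLines(body) + " lines, " + body.length() + " characters");
    }
}
```

Running it as `java SourceCheck https://www.google.pt/` prints the two counts. If the character count is large while the line count stays small, the response is probably complete and simply has few line breaks; if both are small, the server may be sending a different page to this client, which would need a separate look at the request headers.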

0 answers:

No answers