Missing a "Set-Cookie" from the GET response (scraper)?

Time: 2017-06-09 12:10:49

Tags: java authentication web-crawler httprequest scraper

I am trying to authenticate against https://sso-prod.sun.ac.za/cas/login, but the cookies I receive from the initial GET request appear to be incomplete.

This is what Firefox receives:

(Two screenshots of the response cookies as displayed in Firefox.)

But these are the cookies I get from my own request:

    Cookies:
    ""
    BIGipServerpool_cas_sso_443=1954670738.47873.0000; path=/
    JSESSIONID=828CAEC2F2D093C1331494B5B719D48F; Path=/cas; Secure

My Java code:

/*
 * To change this license header, choose License Headers in Project Properties.
 * To change this template file, choose Tools | Templates
 * and open the template in the editor.
 */
package logintest;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.List;
import java.util.Map;
import javax.net.ssl.HttpsURLConnection;


public class Test2 {

    public static void main(String[] args) throws MalformedURLException, IOException {
        String url = "https://sso-prod.sun.ac.za/cas/login";
        CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL)); //turn on cookies

        URL obj = new URL(url);
        HttpsURLConnection conn = (HttpsURLConnection) obj.openConnection();
        conn.setRequestMethod("GET");
        conn.setUseCaches(false);

        String USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:53.0) Gecko/20100101 Firefox/53.0";
        String HOST = "sso-prod.sun.ac.za";
        String ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

        conn.setRequestProperty("Host", HOST);
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setRequestProperty("Accept", ACCEPT);
        conn.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
        conn.setRequestProperty("Accept-Encoding", "gzip, deflate, br");
        conn.setRequestProperty("Connection", "keep-alive");
        conn.setRequestProperty("Upgrade-Insecure-Requests", "1");
        conn.setRequestProperty("Cache-Control", "max-age=0");
        conn.setRequestProperty("Pragma", "no-cache");
        conn.setRequestProperty("Cache-Control", "no-cache");

        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String inputLine;
        StringBuffer response = new StringBuffer();

        while ((inputLine = in.readLine()) != null) {
            response.append(inputLine);
        }
        in.close();

        PrintWriter pw = new PrintWriter("S:\\page2.html");
        pw.println(response);
        pw.close();

        int response_code = conn.getResponseCode();
        System.out.println("response code: " + response_code);

        Map<String, List<String>> headerFields = conn.getHeaderFields();
        for (Map.Entry<String, List<String>> entry : headerFields.entrySet()) {
            String key = entry.getKey();
            List<String> value = entry.getValue();
            System.out.print(key + "|");
            for (String x : value) {
                System.out.print(x + ",");
            }
            System.out.println("");
        }

        System.out.println("-----");
        List<String> cookies = conn.getHeaderFields().get("Set-Cookie");
        if (cookies != null) { // the header is absent when the server sets no cookies
            for (String x : cookies) {
                System.out.println(x);
            }
        }

        // what we want:
        //   JSESSIONID=18551836F6B56A3EF53AC90848CE6BE0; Path=/cas; Secure
        //   bbbbbbbbbbbbbbb=IFAGLBABAAAAAAAACJPKHFAAAAAAAAAAEADAMEJJDGJJAAAADAAALODFDODFAAAA; HttpOnly
        //   f5_cspm=1234;
        // what we got:
        //   f5_cspm=1234;
        //   BIGipServerpool_cas_sso_443=1954670738.47873.0000; path=/
        //   JSESSIONID=F7F993AE732E19B595E2F39FBCFA649A; Path=/cas; Secure
    }
}

1 answer:

Answer 0 (score: 1)

For anyone else with this problem, the fix is to remove this line:

CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));
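
Once that line is removed, the raw Set-Cookie values returned by conn.getHeaderFields() have to be sent back by hand on the following request. A likely reason the removal helps, though the answer does not say so, is that with a default CookieHandler installed the connection hides cookies flagged HttpOnly from getHeaderFields(). The sketch below shows one possible way to turn the received values into a Cookie request header using java.net.HttpCookie. The class name CookieHeaderBuilder and the method toCookieHeader are made up for illustration, and the sample values are the ones quoted in the question.

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original post.
final class CookieHeaderBuilder {

    /**
     * Builds the value of a Cookie request header from raw Set-Cookie
     * response header values. Only name=value pairs are kept; attributes
     * such as Path, Secure and HttpOnly are dropped.
     */
    static String toCookieHeader(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        if (setCookieValues == null) {
            return ""; // no Set-Cookie header in the response
        }
        for (String raw : setCookieValues) {
            if (raw == null || raw.trim().isEmpty()) {
                continue; // skip empty header values
            }
            try {
                for (HttpCookie cookie : HttpCookie.parse(raw)) {
                    pairs.add(cookie.getName() + "=" + cookie.getValue());
                }
            } catch (IllegalArgumentException e) {
                // HttpCookie.parse rejects values that are not name=value; ignore them
            }
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) {
        // Sample values copied from the "what we want" comment in the question.
        List<String> received = Arrays.asList(
                "JSESSIONID=18551836F6B56A3EF53AC90848CE6BE0; Path=/cas; Secure",
                "bbbbbbbbbbbbbbb=IFAGLBABAAAAAAAACJPKHFAAAAAAAAAAEADAMEJJDGJJAAAADAAALODFDODFAAAA; HttpOnly",
                "f5_cspm=1234;");
        System.out.println("Cookie: " + toCookieHeader(received));
    }
}
```

The resulting string could then be passed to conn.setRequestProperty("Cookie", header) on the next request, for example the login POST. Only the name=value pairs are returned because attributes such as Path and Secure are instructions to the client and are not echoed back to the server.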