Any tips for an Android project that GETs and POSTs HTTPS forms and parses HTML?

Time: 2012-06-22 14:21:26

Tags: https webview html-parsing jsoup

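I am building an Android app that should do the following:

  1. Log in through a form on an HTTPS (SSL!) page and receive a cookie
  2. Issue an HTTP GET to fetch HTML
  3. Parse that HTML and show it in a view, a list, or something similar

I have been experimenting with Jsoup, HttpUnit and HtmlUnit for quite a while, but I keep running into a couple of problems:

  a. Logging in works (I get the site's welcome page), but when I then issue a GET request (with the cookie included) I am redirected back to the login form, so the response HTML is not what I expect. (Could this be related to the keep-alive strategy?)

  b. The input buffers seem too small to receive the whole HTML page and hand it over for parsing.

Note: I have no control over the web server.

I am very new to this, so tutorials or code snippets would help.

For example, this is what I use to log in to the site: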

    public int checkLogin() throws Exception {
    
        ArrayList<NameValuePair> data = new ArrayList<NameValuePair>();
        data.add(new BasicNameValuePair("userid", getUsername()));
        data.add(new BasicNameValuePair("password", getPassword()));
        data.add(new BasicNameValuePair("submit_login", "Logmein"));
    
        Log.d(TAG, "Cookie name : " + getCookieName());
        Log.d(TAG, "Cookie cont : " + getCookie());
    
        HttpPost request = new HttpPost(BASE_URL);
        request.getParams().setBooleanParameter(CoreProtocolPNames.USE_EXPECT_CONTINUE, false);
        request.getParams().setParameter("http.protocol.handle-redirects", false);
        request.setEntity(new UrlEncodedFormEntity(data, "UTF-8"));
    
        HttpResponse response;
    
        httpsclient.getCookieStore().clear();
    
        List<Cookie> cookies = httpsclient.getCookieStore().getCookies();
        Log.d(TAG, "Number of Cookies pre-login : "  + cookies.size());
    
        response = httpsclient.execute(request);
    
        cookies = httpsclient.getCookieStore().getCookies();
        Log.d(TAG, "Number of Cookies post-login : "  + cookies.size());
        String html = "";
    
        // Problem : buffer is too small!
    
        InputStream in = response.getEntity().getContent();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        StringBuilder str = new StringBuilder();
        String line = null;
        while ((line = reader.readLine()) != null) {
            str.append(line);
        }
        in.close();
        html = str.toString();
    
        Document doc = Jsoup.parse(html);
        Log.v(TAG, "This is what I have now : " + doc.toString());
    
        if (cookies.size() > 0){
            storeCookie(cookies.get(0).getName(), cookies.get(0).getValue());
            return MensaMobileActivity.REQUEST_SUCCESS;
        } else {
            return MensaMobileActivity.REQUEST_ERROR;           
        }
    
    }
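
A note on the "buffer is too small" comment in the code above: the `readLine()` loop is not limited by any buffer size and should consume the whole stream, although it drops line breaks and decodes with the platform default charset. One possible explanation, offered as an assumption rather than a diagnosis, is that the page arrives intact and only the `Log.v` output is cut off, since logcat truncates long messages (roughly 4 KB per entry). The sketch below reads a stream completely with an explicit charset and splits a long string into pieces small enough to log one at a time. `ResponseText`, `readFully` and `chunk` are hypothetical names, and UTF-8 is assumed as the page encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

final class ResponseText {

    /** Reads the stream until EOF and decodes all bytes with the given charset. */
    static String readFully(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192]; // transfer buffer only; it does not cap the total size
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), charset);
    }

    /** Splits text into pieces of at most maxLen chars so each piece can be logged separately. */
    static List<String> chunk(String text, int maxLen) {
        if (maxLen <= 0) {
            throw new IllegalArgumentException("maxLen must be positive");
        }
        List<String> parts = new ArrayList<String>();
        for (int start = 0; start < text.length(); start += maxLen) {
            parts.add(text.substring(start, Math.min(text.length(), start + maxLen)));
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for response.getEntity().getContent(): a page larger than one buffer.
        StringBuilder page = new StringBuilder("<html><body>");
        for (int i = 0; i < 2000; i++) {
            page.append("<p>row ").append(i).append("</p>");
        }
        page.append("</body></html>");
        Charset utf8 = Charset.forName("UTF-8");
        InputStream in = new ByteArrayInputStream(page.toString().getBytes(utf8));

        String html = readFully(in, utf8);
        List<String> parts = chunk(html, 4000); // on Android: Log.v(TAG, part) for each part
        System.out.println("read " + html.length() + " chars in " + parts.size() + " log-sized parts");
    }
}
```

In `checkLogin()` this would stand in for the `BufferedReader` loop, for example `String html = ResponseText.readFully(response.getEntity().getContent(), Charset.forName("UTF-8"));`. Jsoup can also parse a stream directly through `Jsoup.parse(InputStream in, String charsetName, String baseUri)`, which avoids building the intermediate string.
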
    

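On the redirect back to the login form: the code above stores only `cookies.get(0)`. If the server sets more than one cookie at login, re-sending just the first one may drop the session cookie, which would look exactly like "not logged in". This is a guess, since the server's behaviour is not shown. Reusing the same `httpsclient` instance for the follow-up GET should send everything in its cookie store automatically. The sketch below illustrates the idea with the JDK's own `java.net.CookieManager`: keep every `Set-Cookie` value and render all matching cookies into one `Cookie` header. `SessionCookies`, its method names and the example URLs are hypothetical.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SessionCookies {

    // In-memory store; ACCEPT_ALL is used here only to keep the sketch simple.
    private final CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);

    /** Stores every Set-Cookie header value of a response, not just the first one. */
    void remember(URI uri, List<String> setCookieHeaders) throws IOException {
        Map<String, List<String>> headers = new HashMap<String, List<String>>();
        headers.put("Set-Cookie", setCookieHeaders);
        manager.put(uri, headers);
    }

    /** Builds the Cookie header value for a later request; empty if nothing matches. */
    String cookieHeaderFor(URI uri) throws IOException {
        Map<String, List<String>> out =
                manager.get(uri, Collections.<String, List<String>>emptyMap());
        List<String> cookies = out.get("Cookie");
        if (cookies == null || cookies.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (String cookie : cookies) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(cookie);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        SessionCookies jar = new SessionCookies();
        // Pretend the login response carried two cookies (hypothetical values).
        jar.remember(URI.create("https://www.example.com/login"),
                Arrays.asList("JSESSIONID=abc123; Path=/", "lang=nl; Path=/"));
        // Both cookies should be offered for a later page on the same host.
        System.out.println(jar.cookieHeaderFor(URI.create("https://www.example.com/members/menu")));
    }
}
```

With `HttpsURLConnection`, the resulting string could be passed to `setRequestProperty("Cookie", ...)`. Installing a `CookieManager` through `CookieHandler.setDefault(...)` lets `HttpURLConnection` do this bookkeeping by itself.
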
1 Answer:

Answer 0 (score: 0)

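You are not handling the SSL certificate at all, which is at least part of the problem. I recently started learning this the hard way myself. This block of code fetches the SSL certificate from the web page you are accessing.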

    try {
        URL url = new URL(YOUR_WEBPAGE_HERE);
        HttpsURLConnection connect = (HttpsURLConnection)url.openConnection();
        connect.connect();
        Certificate[] certs = connect.getServerCertificates();

        if (certs.length > 0) {

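            // cert is assumed to be a java.io.File field declared elsewhere
            cert = new File("YOUR_PATH_TO_THE_FILE");
            // Write the DER-encoded server (leaf) certificate to the cert file.
            OutputStream os = new FileOutputStream(cert);
            try {
                os.write(certs[0].getEncoded());
            } finally {
                os.close(); // release the file handle even if encoding fails
            }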

            return true;
        }
    }
    catch (SSLPeerUnverifiedException e) {
        e.printStackTrace();
    }
    catch (MalformedURLException e) {
        e.printStackTrace();
    }
    catch (IOException e) {
        e.printStackTrace();
    }
    catch (CertificateEncodingException e) {
        e.printStackTrace();
    }
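
The block above only saves the certificate. To use it, the app would still have to tell its HTTPS connections to trust it. A minimal sketch of that step with the standard JSSE classes follows, assuming the saved file holds a single X.509 certificate and that the server's certificate is not already trusted by the device (for example, because it is self-signed). `PinnedTrust` and `contextTrusting` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

final class PinnedTrust {

    /** Builds an SSLContext that trusts only the X.509 certificate read from the stream. */
    static SSLContext contextTrusting(InputStream certStream)
            throws GeneralSecurityException, IOException {
        // Parse the saved certificate; a malformed or empty stream raises CertificateException.
        CertificateFactory factory = CertificateFactory.getInstance("X.509");
        Certificate serverCert = factory.generateCertificate(certStream);

        // Empty in-memory key store that holds just the saved certificate.
        KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
        trustStore.load(null, null);
        trustStore.setCertificateEntry("server", serverCert);

        // Trust managers backed by that key store instead of the system defaults.
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, tmf.getTrustManagers(), null);
        return context;
    }
}
```

The resulting context could be applied before connecting, for example `connect.setSSLSocketFactory(PinnedTrust.contextTrusting(new FileInputStream(cert)).getSocketFactory());`. Because only this one certificate is trusted, connections would start failing once the server renews it.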