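Scraping a web page with HtmlUnit

Date: 2018-11-17 08:19:19

Tags: java web htmlunit

This is my code. I want to access a website and compare its results with my own data. I want Java to put the data into the input fields, automatically click the "Calculate" button, and then return the answer to Java.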

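import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlRadioButtonInput;
import com.gargoylesoftware.htmlunit.html.HtmlSpan;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;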

public class MyWebServiceAccess {

  public static void main(String[] args) throws Exception 
  {

        final WebClient webClient = new WebClient();

        final HtmlPage page = webClient.getPage("https://www.socscistatistics.com/tests/signedranks/Default2.aspx");

        // Inputs
        HtmlTextInput treatment1 = (HtmlTextInput) page.getElementById("ctl00_MainContent_TextBox1");
        HtmlTextInput treatment2 = (HtmlTextInput) page.getElementById("ctl00_MainContent_TextBox2");

        // Significance Level:
        HtmlRadioButtonInput s1= (HtmlRadioButtonInput) page.getElementById("ctl00_MainContent_RadioButtonList1_0");
        HtmlRadioButtonInput s2= (HtmlRadioButtonInput) page.getElementById("ctl00_MainContent_RadioButtonList1_1");


        // 1 or 2-tailed hypothesis?:
        HtmlRadioButtonInput t1= (HtmlRadioButtonInput) page.getElementById("ctl00_MainContent_RadioButtonList2_0");
        HtmlRadioButtonInput t2= (HtmlRadioButtonInput) page.getElementById("ctl00_MainContent_RadioButtonList2_1");

        // Calculate
        HtmlSubmitInput Calculate= (HtmlSubmitInput) page.getElementById("ctl00_MainContent_Button2");

        // Result Span
        HtmlSpan result = (HtmlSpan) page.getElementById("ctl00_MainContent_Label9");

        // Fill in Inputs 
        treatment1.setValueAttribute("");
        treatment2.setValueAttribute("");

        s1.setChecked(true);
        s2.setChecked(false);

        t1.setChecked(true);
        t2.setChecked(false);

        Calculate.click();

        // Printing the Output
        System.out.println(result.asText());

        webClient.closeAllWindows();

   }

}

1 answer:

Answer 0 (score: 0)

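To scrape web pages you need a basic understanding of how web technologies (HTTP/HTML) work, plus some Java/programming knowledge. At the very least, it really helps to be able to spot the problems in your own program.

First, your code throws a ClassCastException because the input fields are text areas (HtmlTextArea), not text input controls (HtmlTextInput). Second, you have to click the correct button: your code clicks the "Reset" button (ctl00_MainContent_Button2) instead of "Calculate" (ctl00_MainContent_Button1). Finally, clicking the button gives you a new page, and your result is on that new page, so you have to look up the result span there rather than on the original page.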
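The ClassCastException described above is easier to diagnose when the lookup reports which element type was actually found. The following is only a minimal sketch using plain Java: `ElementCasts` and `requireType` are hypothetical names that are not part of HtmlUnit, and the helper accepts any object so that it can be tried without the library on the classpath.

```java
final class ElementCasts {

    private ElementCasts() {
    }

    /**
     * Returns element cast to the expected type. If the element is missing
     * or has a different type, throws an exception that names the id, the
     * expected type, and the type that was actually found.
     */
    static <T> T requireType(Object element, Class<T> expected, String id) {
        if (element == null) {
            throw new IllegalStateException("No element with id '" + id + "'");
        }
        if (!expected.isInstance(element)) {
            throw new IllegalStateException("Element '" + id + "' is a "
                    + element.getClass().getSimpleName()
                    + ", not a " + expected.getSimpleName());
        }
        return expected.cast(element);
    }
}
```

With HtmlUnit this could be used as `requireType(page.getElementById(id), HtmlTextArea.class, id)`. In the question's code, the error message would then have pointed at the text area directly instead of surfacing as a bare cast failure.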

Hope that helps.

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlRadioButtonInput;
import com.gargoylesoftware.htmlunit.html.HtmlSpan;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextArea;

public class MyWebServiceAccess {

    public static void main(String[] args) throws Exception {
        String url = "https://www.socscistatistics.com/tests/signedranks/Default2.aspx";

        // try-with-resources closes the WebClient when the block exits
        try (final WebClient webClient = new WebClient()) {
            HtmlPage page = webClient.getPage(url);

            // Inputs: both fields are <textarea> elements, not <input type="text">
            HtmlTextArea treatment1 = (HtmlTextArea) page.getElementById("ctl00_MainContent_TextBox1");
            HtmlTextArea treatment2 = (HtmlTextArea) page.getElementById("ctl00_MainContent_TextBox2");

            // Significance Level:
            HtmlRadioButtonInput s1 = (HtmlRadioButtonInput) page.getElementById("ctl00_MainContent_RadioButtonList1_0");
            HtmlRadioButtonInput s2 = (HtmlRadioButtonInput) page.getElementById("ctl00_MainContent_RadioButtonList1_1");
            s1.setChecked(true);
            s2.setChecked(false);

            // 1 or 2-tailed hypothesis?:
            HtmlRadioButtonInput t1 = (HtmlRadioButtonInput) page.getElementById("ctl00_MainContent_RadioButtonList2_0");
            HtmlRadioButtonInput t2 = (HtmlRadioButtonInput) page.getElementById("ctl00_MainContent_RadioButtonList2_1");
            t1.setChecked(true);
            t2.setChecked(false);

            // Fill in Inputs, one value per line
            treatment1.type("4\n3\n2\n5\n5\n3");
            treatment2.type("1\n2\n3\n0\n0\n2");

            // Clicking Calculate (Button1; Button2 is Reset) returns a new page
            HtmlSubmitInput calculate = (HtmlSubmitInput) page.getElementById("ctl00_MainContent_Button1");
            page = calculate.click();

            // Result Span, looked up on the new page
            HtmlSpan result = (HtmlSpan) page.getElementById("ctl00_MainContent_Label9");

            // Printing the Output
            System.out.println(result.asText());
        }
    }
}
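The code above types hard-coded, newline-separated values into the text areas and prints the result as text. If the data lives in Java arrays and the answer is needed as numbers, two small helpers along the following lines might be useful. This is only a sketch with plain Java: `FormText`, `joinLines`, and `extractNumbers` are hypothetical names, and the exact wording of the site's result text is not known here, so the number extraction simply collects every decimal number it finds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class FormText {

    // Optional minus sign, digits, optional fractional part
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(?:\\.\\d+)?");

    private FormText() {
    }

    /** Joins the values into one string with one value per line. */
    static String joinLines(int... values) {
        return Arrays.stream(values)
                .mapToObj(String::valueOf)
                .collect(Collectors.joining("\n"));
    }

    /** Returns every decimal number found in the text, in order of appearance. */
    static List<Double> extractNumbers(String text) {
        List<Double> numbers = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(text);
        while (matcher.find()) {
            numbers.add(Double.parseDouble(matcher.group()));
        }
        return numbers;
    }
}
```

For example, `treatment1.type(FormText.joinLines(4, 3, 2, 5, 5, 3))` would type the same text as the first `type(...)` call above, and `FormText.extractNumbers(result.asText())` would return whatever numbers appear in the result span, which can then be compared with locally computed values.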