Retrieving the final redirect URL with HttpUnit

Asked: 2013-03-28 10:50:37

Tags: java http web-crawler http-unit

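I am using HttpUnit to simulate a search on this website: http://www.voyages-sncf.com/. My code is below. It does not give me the final redirect URL: I only get the URL of the search page, not the URL of the results page.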

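import java.io.IOException;
import java.net.URL;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.RefreshHandler;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class TestHttpUnit {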

    public static void main(String[] args) throws Exception {

        // Create and initialize WebClient object
        WebClient webClient = new WebClient(BrowserVersion.FIREFOX_10);
        webClient.setThrowExceptionOnScriptError(false);
        webClient.setRefreshHandler(new RefreshHandler() {
            public void handleRefresh(Page page, URL url, int arg) throws IOException {
                System.out.println("handleRefresh");
            }
        });

        // Load the home page and get the search form
        HtmlPage page = (HtmlPage) webClient.getPage("http://www.voyages-sncf.com/");
        // Find the form by its name
        HtmlForm form = page.getFormByName("TrainTypeForm");
        // Alternatively, find the form by its action
        //HtmlForm form = page.getFirstByXPath("//form[@action='http://www.voyages-sncf.com/dynamic/expressbooking/_SvExpressBooking']");


        // Fill in the origin city, destination city and outward date
        form.getInputByName("origin_city").setValueAttribute("paris");
        form.getInputByName("destination_city").setValueAttribute("marseille");
        form.getInputByName("outward_date").setValueAttribute("29/03/2013");


        // Click the "Rechercher" (search) button and keep the page it returns
        page = (HtmlPage) form.getInputByValue("Rechercher").click();



        System.out.println(page.asText());
    }
}

1 Answer:

Answer 0 (score: 0)

According to the HttpUnit cookbook, form submission should be handled like this:

WebForm form = resp.getForms()[0];      // select the first form in the page
// ... fill in form fields, and finally:
form.submit();                          // submit the form

After that, you land on the "please wait" screen. I haven't tried it, but perhaps you should do something like this (pseudocode):

do {
    // Wait a while
    wc.waitForBackgroundJavaScript(3000);
    // Somehow refresh the page, since that's what your browser might do, too
} while (someTestToVerifyThatYoureStillOnTheWaitingPage(wc));
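
As written, the loop above never exits if the waiting page does not go away. Below is a minimal sketch of the same polling idea with a bounded number of attempts, using only the JDK. The names FinalUrlPoller, awaitFinalUrl, currentUrl and stillWaiting are hypothetical, and the URLs are placeholders; in real use the currentUrl supplier would have to be wired to whatever your client reports as the URL of the current page, and the stillWaiting test is an assumption about how the waiting page can be recognised.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Polls a "current URL" source until it no longer looks like the waiting page. */
final class FinalUrlPoller {

    private FinalUrlPoller() {}

    /**
     * @param currentUrl   hypothetical hook returning the URL the client is on right now
     * @param stillWaiting returns true while that URL still belongs to the waiting page
     * @param maxAttempts  upper bound on polls, so the loop cannot spin forever
     * @param pauseMillis  delay between two polls
     * @return the first URL that is not the waiting page, or empty if attempts run out
     */
    static Optional<String> awaitFinalUrl(Supplier<String> currentUrl,
                                          Predicate<String> stillWaiting,
                                          int maxAttempts,
                                          long pauseMillis) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String url = currentUrl.get();
            if (!stillWaiting.test(url)) {
                return Optional.of(url);
            }
            // Still on the waiting page: pause before asking again.
            Thread.sleep(pauseMillis);
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated client (placeholder URLs): waiting page twice, then the results page.
        String[] visited = {
            "http://example.test/wait",
            "http://example.test/wait",
            "http://example.test/results"
        };
        int[] index = {0};
        Supplier<String> fakeClient =
                () -> visited[Math.min(index[0]++, visited.length - 1)];

        Optional<String> finalUrl =
                awaitFinalUrl(fakeClient, url -> url.endsWith("/wait"), 10, 50L);
        System.out.println(finalUrl.orElse("timed out"));
    }
}
```

Returning an Optional keeps the timeout case explicit, so the caller has to decide what to do when the results page never shows up instead of silently printing the waiting page.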