How do I submit a JavaScript form with jsoup?

Asked: 2016-05-06 10:27:11

Tags: javascript java forms jsoup

I want to extract some data from http://www.wettportal.com/quotenarchiv/.

The page contains a JavaScript-driven search form:

<form id="archivesearchform" name="archivesearchform" method="post" action="">
...
<td class="ralign">Sportart:</td>
<td>
<select name="sport_id" id="sport_id" style="width:100%">
...
<td class="ralign">Land:</td>
<td>
<select name="region_id" id="region_id" style="width:100%;">
...
<td class="ralign">Liga:</td>
<td>
<select name="league_id" id="league_id" style="width:100%">
...
<td class="ralign">vom:</td>
<td>
<input type="text" name="fromdate" id="fromdate" style="width:100%" />
...
<td class="ralign">vom:</td>
<td>
<input type="text" name="fromdate" id="fromdate" style="width:100%" />
</td>
<td class="ralign">bis:</td>
<td>
<input type="text" name="tilldate" id="tilldate" style="width:100%" />
</td>
<td colspan="2"></td>
</tr>
<tr>
<td class="ralign">Teilnehmer:</td>
<td colspan="3"><input type="text" name="team" style="width:100%" /></td>
<td colspan="2"></td>
</tr>
</tbody>

and a submit button:

<tr>
<td class="lalign"></td>
<td class="calign"><input type="submit" name="btnSubmit" value="Suchen" /></td>
<td class="ralign"><div class="loading-animation" id="div_loading"></div></td>
</tr>

I tried this code:

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class QAJesoupE {

    public static void main(String[] args) {

        try {
            Document doc = Jsoup.connect("http://www.wettportal.com/quotenarchiv/")
                .data("sport_id", "4")
                .data("region_id", "16")
                .data("league_id", "0")
                .data("fromdate", "")
                .data("tilldate", "")
                .data("team", "")
                // and other hidden fields which are being passed in post request.
                .userAgent("Mozilla")
                .post();
            System.out.println(doc); // prints the HTML returned by the POST request

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

But I only get HTML that contains no search results. :-/

Can anyone help me?

Many thanks in advance!

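1 answer:

Answer 0: (score: 2)

The site uses a script to handle the form submission. Even though the form element declares method="post", the script actually sends a GET request and passes the data as URL parameters: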

http://www.wettportal.com/lib/ajax/getArchivedEvents.php?partner=wettportal&lang=de&sport_id=4&region_id=23&league_id=0&fromdate=&tilldate=&team=
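To make the shape of that URL concrete, here is a minimal sketch that assembles it from name/value pairs using only the JDK. The class name `ArchiveQuery` and the helper `buildUrl` are hypothetical names chosen for illustration; the parameter names and values are copied from the URL above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper class, not part of jsoup or of the site.
class ArchiveQuery {

    // Joins the pairs into "base?name=value&name=value", URL-encoding each part.
    static String buildUrl(String base, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return base + "?" + query;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("partner", "wettportal");
        params.put("lang", "de");
        params.put("sport_id", "4");
        params.put("region_id", "23");
        params.put("league_id", "0");
        params.put("fromdate", "");
        params.put("tilldate", "");
        params.put("team", "");
        System.out.println(buildUrl(
                "http://www.wettportal.com/lib/ajax/getArchivedEvents.php", params));
    }
}
```

Empty values such as `fromdate` still appear in the query string, just with nothing after the `=`.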

Jsoup builds the request URL (including the parameters) for you, but you have to send a GET request and include the X-Requested-With header (see below):

Document doc = Jsoup
    .connect("http://www.wettportal.com/lib/ajax/getArchivedEvents.php")
    .data("sport_id", "4")
    .data("region_id", "16")
    .data("league_id", "0")
    .data("fromdate", "")
    .data("tilldate", "")
    .data("team", "")
    .header("X-Requested-With", "XMLHttpRequest")
    .timeout(10000)
    .get();
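
For comparison, the same kind of request can be sketched without jsoup, using the JDK's `java.net.http.HttpClient` (Java 11 or later). This is not part of the original answer: the class name `ArchiveRequest` and the helper `build` are hypothetical, and it is an assumption that the endpoint still responds the way it did when the answer was written.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class illustrating the GET request described above.
class ArchiveRequest {

    // Builds a GET request carrying the X-Requested-With header.
    static HttpRequest build(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("X-Requested-With", "XMLHttpRequest")
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        String url = "http://www.wettportal.com/lib/ajax/getArchivedEvents.php"
                + "?sport_id=4&region_id=16&league_id=0&fromdate=&tilldate=&team=";
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(build(url), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body()); // raw response body as text
    }
}
```

If the body is HTML, it can then be passed to `Jsoup.parse(response.body())` to get a `Document` for selecting elements.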