Jsoup: unable to retrieve an element

Time: 2013-04-28 11:13:21

Tags: android html-parsing jsoup

I'm trying out the jsoup API. As a first attempt, I want to retrieve the text of the search button on google.it and display it in a TextView. I used this code:

protected String doInBackground(Void... params) {
    Document doc;
    try {
        doc = Jsoup.connect("http://www.google.it/").get();
        Elements cerca_con_google = doc.select("button[id=gbqfba[aria-label]]");
        int size = cerca_con_google.size();
        Log.i("AAAAAAAAAA", Integer.toString(size));

        if (cerca_con_google != null) {
            return cerca_con_google.text();
        }
        return "foo";
    } catch (IOException e) {
        // Minimal handler (the catch block is not shown in the post).
        Log.e("AAAAAAAAAA", "Request failed", e);
        return "foo";
    }
}

But the size of the Elements is always zero. Maybe my selector query in the select method is wrong... or is it something else?

2 answers:

Answer 0 (score: 2)

Google returns different content depending on request header values such as User-Agent:

import java.io.IOException;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GoogleHeaders {

    public static void main(String... args) throws IOException {
        // First request: jsoup's default headers.
        System.out.println("-- Default header");
        Document document = Jsoup.connect("http://www.google.it/").get();
        dump(document, "input");
        dump(document, "button");

        // Second request: same URL, but with a browser-like User-Agent.
        System.out.println("-- Custom header");
        Connection conn = Jsoup.connect("http://www.google.it/");
        conn.header("User-Agent", "Firefox/20.0");
        document = conn.get();
        dump(document, "input");
        dump(document, "button");

        // Look up one button by id in the second response.
        Element elem = document.select("button#gbqfbb").first();
        System.out.println();
        System.out.println("button#gbqfbb = " + elem);
    }

    // Prints how many elements match the tag, then each element's attributes.
    private static void dump(Document document, String tag) {
        Elements elems = document.select(tag);
        System.out.println(tag + " - " + elems.size());
        for (Element elem : elems) {
            System.out.println(tag + " [" + elem.attributes() + "]");
        }
    }
}

Output

-- Default header
input - 7
input [ name="ie" value="ISO-8859-1" type="hidden"]
input [ value="it" name="hl" type="hidden"]
input [ name="source" type="hidden" value="hp"]
input [ autocomplete="off" class="lst" value="" title="Cerca con Google" maxlength="2048" name="q" size="57" style="color:#000;margin:0;padding:5px 8px 0 6px;vertical-align:top"]
input [ class="lsb" value="Cerca con Google" name="btnG" type="submit"]
input [ class="lsb" value="Mi sento fortunato" name="btnI" type="submit" onclick="if(this.form.q.value)this.checked=1; else top.location='/doodles/'"]
input [ type="hidden" id="gbv" name="gbv" value="1"]
button - 0
-- Custom header
input - 4
input [ type="hidden" name="output" value="search"]
input [ type="hidden" name="ie" value="UTF-8"]
input [ type="hidden" name="sclient" value="psy-ab"]
input [ id="gbqfq" class="gbqfif" name="q" type="text" autocomplete="off" value=""]
button - 3
button [ id="gbqfb" aria-label="Cerca con Google" class="gbqfb" name="btnG"]
button [ id="gbqfba" aria-label="Cerca con Google" name="btnK" class="gbqfba"]
button [ id="gbqfbb" aria-label="Mi sento fortunato" name="btnI" class="gbqfba" onclick="if(this.form.q.value)this.checked=1;else window.top.location='/doodles/'"]

button#gbqfbb = <button id="gbqfbb" aria-label="Mi sento fortunato" name="btnI" class="gbqfba" onclick="if(this.form.q.value)this.checked=1;else window.top.location='/doodles/'"><span id="gbqfsb">Mi sento fortunato</span></button>
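Applied to the task in the question, the same idea might look like the sketch below. It assumes the markup shown in the output above (which Google may since have changed) and reads the button's aria-label attribute, since that is where the visible label appears in that output. The class name `SearchButtonLabel` and the helper `searchButtonLabel` are hypothetical names chosen for illustration.

```java
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SearchButtonLabel {

    // Returns the aria-label of button#gbqfba, or null when the page has no
    // such button (as in the "Default header" response, which has no buttons).
    static String searchButtonLabel(Document doc) {
        Element button = doc.select("button#gbqfba").first();
        return button == null ? null : button.attr("aria-label");
    }

    public static void main(String[] args) throws IOException {
        // userAgent(...) sets the User-Agent request header;
        // "Firefox/20.0" is the value used in the answer above.
        Document doc = Jsoup.connect("http://www.google.it/")
                .userAgent("Firefox/20.0")
                .get();
        System.out.println(searchButtonLabel(doc));
    }
}
```

On Android the request would stay inside doInBackground, as in the question, with the helper's result returned instead of printed.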

Answer 1 (score: 0)

I think you are using the wrong selector. Use "button[id=gbqfba]" instead.
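To compare selector forms without a network request, a small sketch can parse the three buttons from the first answer's output as a literal string. Attribute selectors cannot be nested inside each other; to require both an id and the presence of aria-label, the brackets are chained. The class name `SelectorForms` and the helper `count` are hypothetical, and the HTML is trimmed to the attributes relevant here.

```java
import org.jsoup.Jsoup;

public class SelectorForms {

    // Markup based on the "Custom header" output in the first answer
    // (onclick attribute and inner span omitted).
    static final String HTML =
          "<button id=\"gbqfb\" aria-label=\"Cerca con Google\" class=\"gbqfb\" name=\"btnG\"></button>"
        + "<button id=\"gbqfba\" aria-label=\"Cerca con Google\" name=\"btnK\" class=\"gbqfba\"></button>"
        + "<button id=\"gbqfbb\" aria-label=\"Mi sento fortunato\" name=\"btnI\" class=\"gbqfba\"></button>";

    // Number of elements in HTML matched by the given CSS selector.
    static int count(String css) {
        return Jsoup.parse(HTML).select(css).size();
    }

    public static void main(String[] args) {
        String[] selectors = {
            "button[id=gbqfba]",             // attribute equals value
            "button#gbqfba",                 // shorthand for the same id match
            "button[id=gbqfba][aria-label]", // id match AND attribute present
            "button[aria-label]"             // any button that has the attribute
        };
        for (String css : selectors) {
            System.out.println(css + " -> " + count(css));
        }
    }
}
```

A corrected selector only helps once the response actually contains button elements, which in the first answer's output required the custom User-Agent.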