JSoup: How do I remove an element from a form?

Asked: 2017-09-29 06:08:21

Tags: java jsoup

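I use jsoup to parse an HTML page and submit a form. Before submitting the form, I need to remove the "Back" button. I used the element.remove() method, but then found that form.formData() had not changed. The removed element is gone from form.children() but is still present in form.elements(). Is this a bug, or am I removing the element from the form the wrong way?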

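import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.FormElement;

public class JsoupCheck {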
    public static void main(String[] args) {
        String html = "<html><body><form action=\"demo\">"
                + "<input type=\"submit\" name=\"buttonSave\" value=\"Save\">"
                + "<input type=\"submit\" name=\"buttonBack\" value=\"Back\">"
                + "<select name=\"selection\">"
                + "  <option value=\"value1\">Value 1</option>"
                + "  <option value=\"value2\" selected>Value 2</option>"
                + "  <option value=\"value3\">Value 3</option>"
                + "</select>"
                + "</form></body></html>";
        Document doc = Jsoup.parse(html);
        FormElement form = (FormElement) doc.select("form").first();
        Element e = form.select("form").first();

        System.out.println("=== Original content of form");
        System.out.println(e);
        System.out.println("=== Original content of form.formData()");
        for (Connection.KeyVal i : form.formData()) {
            System.out.println(i.key() + "=" + i.value());
        }
        System.out.println("form.elements().size() = " + form.elements().size());
        System.out.println("form.children().size() = " + form.children().size());

        e.select("input[name=buttonBack]").remove();
        System.out.println();

        System.out.println("=== Content of form after remove buttonBack (result: buttonBack removed)");
        System.out.println(e);
        System.out.println("=== Content of form.formData() after remove buttonBack (result: buttonBack exist)");
        for (Connection.KeyVal i : form.formData()) {
            System.out.println(i.key() + "=" + i.value());
        }
        System.out.println("form.elements().size() = " + form.elements().size());
        System.out.println("form.children().size() = " + form.children().size());
    }
}

The output is:

=== Original content of form
<form action="demo">
 <input type="submit" name="buttonSave" value="Save">
 <input type="submit" name="buttonBack" value="Back">
 <select name="selection"> <option value="value1">Value 1</option> <option value="value2" selected>Value 2</option> <option value="value3">Value 3</option></select>
</form>
=== Original content of form.formData()
buttonSave=Save
buttonBack=Back
selection=value2
form.elements().size() = 3
form.children().size() = 3

=== Content of form after remove buttonBack (result: buttonBack removed)
<form action="demo">
 <input type="submit" name="buttonSave" value="Save">
 <select name="selection"> <option value="value1">Value 1</option> <option value="value2" selected>Value 2</option> <option value="value3">Value 3</option></select>
</form>
=== Content of form.formData() after remove buttonBack (result: buttonBack exist)
buttonSave=Save
buttonBack=Back
selection=value2
form.elements().size() = 3
form.children().size() = 2

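1 answer:

Answer 0 (score: 2)

FormElement is a special kind of node. In addition to the list of all children that it inherits from Node, it keeps a second, internal list of all the elements within the form.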

public class FormElement extends Element {
    private final Elements elements = new Elements();
    ...
}

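When you call Node#remove on a child, it updates the parent's list of children, but not this internal list.

So, to really remove the element, you also need to remove it from the internal list: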

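// Removes the button from the DOM, i.e. from form.children()
e.select("input[name=buttonBack]").remove();
// Also remove it from the form's internal list; the parameter is el because e is already declared in main()
form.elements().removeIf(el -> el.attr("name").equals("buttonBack"));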
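
To see why the two lists drift apart, here is a toy model in plain Java. It is not jsoup code: the class ToyForm and its methods addControl, removeChild and removeControl are hypothetical names made up for this sketch, and strings stand in for elements. It only illustrates the mechanism described in this answer, where a removal that touches one list leaves the other unchanged.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (NOT jsoup): a form that tracks its controls in two separate lists.
public class TwoListFormSketch {

    /** Hypothetical stand-in for a form element. */
    public static class ToyForm {
        public final List<String> children = new ArrayList<>(); // plays the role of form.children()
        public final List<String> controls = new ArrayList<>(); // plays the role of form.elements()

        /** Registers a control in both lists, as happens when the form is parsed. */
        public void addControl(String name) {
            children.add(name);
            controls.add(name);
        }

        /** Mimics the behaviour described above: only the child list is updated. */
        public void removeChild(String name) {
            children.remove(name);
        }

        /** The fix: remove the entry from both lists. */
        public void removeControl(String name) {
            children.remove(name);
            controls.remove(name);
        }
    }

    public static void main(String[] args) {
        ToyForm form = new ToyForm();
        form.addControl("buttonSave");
        form.addControl("buttonBack");
        form.addControl("selection");

        form.removeChild("buttonBack");
        System.out.println("children = " + form.children.size()); // children = 2
        System.out.println("controls = " + form.controls.size()); // controls = 3 (stale entry)

        form.removeControl("buttonBack");
        System.out.println("controls = " + form.controls.size()); // controls = 2
    }
}
```

The sizes printed after removeChild match the question's output (2 children, 3 elements), which is the same symptom seen with form.children() and form.elements().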