Question

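I have a set of elements obtained by parsing an HTML document. The elements may contain duplicates. What is the best way to list only the unique elements?

I come from a C++ background and can see how to do this there with a set and a custom equality operation. However, I am not sure how to do it in Java. I would appreciate any code that helps me do this in a correct and efficient way.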

ArrayList<Element> values = new ArrayList<Element>();

// Parse the html and get the document
Document doc = Jsoup.parse(htmlDataInStringFormat);

// Go through each selector and find all matching elements
for ( String selector: selectors ) {

    //Find elements matching this selector
    Elements elements = doc.select(selector);

    //If there are no matching elements, proceed to next selector
    if ( 0 == elements.size() ) continue;

    for (Element elem: elements ){
        values.add(elem);
    }
}

// 'elements' is out of scope here; the collected list is 'values'
if ( values.size() > 0 ) {
    // ????? Need to remove duplicates here
}

Answer 1

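java.util.HashSet will give you an unordered set. The API also has other implementations of java.util.Set that give you ordered sets or concurrent behavior if you need them.

Depending on what the class Element is, you may also need to implement equals and hashCode on it, as per @musical_coder's comment.

For example: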

Set<Element> set = new HashSet<Element>(elements);
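
As a minimal sketch of the same idea, the snippet below deduplicates a list while keeping first-seen order, which a plain HashSet does not guarantee. Plain strings stand in for parsed elements, and the class name DedupDemo and the sample values are illustrative assumptions, not part of the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class DedupDemo {
    // Returns the distinct items in the order they were first seen.
    // LinkedHashSet relies on equals/hashCode, just like HashSet.
    static <T> List<T> unique(List<T> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(unique(values)); // prints [a, b, c]
    }
}
```

If the order of the result does not matter, swapping LinkedHashSet for HashSet gives the version shown in the answer.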

To supply an overridden equals method for Element, I would create a thin wrapper around the Element class, called MyElement or something more meaningful, e.g.

    public static class MyElement extends Element {

        private final Element element;

        public MyElement(Element element){
            this.element = element;
        }

        // Override equals and hashCode
        // Delegate all other methods
    }

and pass those into the set. OK, so now I'm hoping the class isn't final. Effectively, you wrap every element in this class. Ah, ElementWrapper is a better name.
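
A wrapper does not have to extend the wrapped class, so it also works when that class is final. The sketch below uses composition only. Item is a hypothetical stand-in for a third-party class that cannot be modified, and ItemWrapper, unwrap and the text field are names invented for illustration. With jsoup you might key the wrapper on something like outerHtml(), but what counts as a "duplicate" is an assumption you have to make for your own data.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class WrapperDemo {
    // Deduplicates items by their text, keeping the first occurrence of each.
    static List<Item> unique(List<Item> items) {
        Set<ItemWrapper> seen = new LinkedHashSet<>();
        for (Item it : items) {
            seen.add(new ItemWrapper(it)); // ignored if an equal wrapper is present
        }
        List<Item> result = new ArrayList<>();
        for (ItemWrapper w : seen) {
            result.add(w.unwrap());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Item> items = new ArrayList<>();
        items.add(new Item("x"));
        items.add(new Item("y"));
        items.add(new Item("x"));
        System.out.println(unique(items).size()); // prints 2
    }
}

// Hypothetical stand-in for a class we cannot modify (it may even be final).
final class Item {
    final String text;

    Item(String text) {
        this.text = text;
    }
}

// Thin wrapper that defines equality in terms of the wrapped item's text.
final class ItemWrapper {
    private final Item item;

    ItemWrapper(Item item) {
        this.item = item;
    }

    Item unwrap() {
        return item;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ItemWrapper)) return false;
        return item.text.equals(((ItemWrapper) o).item.text);
    }

    @Override
    public int hashCode() {
        return item.text.hashCode();
    }
}
```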

Answer 2

Add the elements to a java.util.HashSet; it only contains unique elements.

Answer 3

If you just want to avoid duplicates, use a HashSet. If you want sorting as well as no duplicates, use a TreeSet.
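
A small sketch of the difference, using integers because they already have a natural ordering (TreeSet needs either Comparable elements or a Comparator). The class and method names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

class SetChoiceDemo {
    // TreeSet drops duplicates and iterates in ascending order.
    static List<Integer> sortedUnique(List<Integer> xs) {
        return new ArrayList<>(new TreeSet<>(xs));
    }

    // HashSet drops duplicates but promises no particular iteration order.
    static int distinctCount(List<Integer> xs) {
        return new HashSet<>(xs).size();
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(3, 1, 3, 2, 1);
        System.out.println(sortedUnique(xs));  // prints [1, 2, 3]
        System.out.println(distinctCount(xs)); // prints 3
    }
}
```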

Answer 4

Also override the equals and hashCode methods of Element

class Element {
    ...

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Element)) {
            return false;
        }
        Element other = (Element) o;
        // compare the fields of this and other, e.g.
        if (other.a != this.a) { return false; }
        ...
        return true;
    }
    ...
    @Override
    public int hashCode() {
        // compute a value that is the same for objects that are equal
    }
}
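
For a class you own, a complete pair of methods might look like the sketch below. Entry and its fields a and b are hypothetical, chosen only to mirror the field name used above. The point it illustrates is that equals and hashCode must be derived from the same fields, otherwise hash-based sets will not detect duplicates.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsDemo {
    // Hypothetical value class that we are free to modify.
    static final class Entry {
        private final int a;
        private final String b;

        Entry(int a, String b) {
            this.a = a;
            this.b = b;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Entry)) return false;
            Entry other = (Entry) o;
            return a == other.a && Objects.equals(b, other.b);
        }

        @Override
        public int hashCode() {
            // Built from the same fields that equals compares.
            return Objects.hash(a, b);
        }
    }

    // Counts how many distinct entries are present according to equals/hashCode.
    static int countDistinct(Entry... entries) {
        Set<Entry> set = new HashSet<>(Arrays.asList(entries));
        return set.size();
    }

    public static void main(String[] args) {
        int n = countDistinct(new Entry(1, "x"), new Entry(1, "x"), new Entry(2, "x"));
        System.out.println(n); // prints 2
    }
}
```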

Answer 5

The posted answers would work if it were possible to modify Element, but I can't do that. I don't need an ordered set, so this is the solution I found:

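TreeSet<Element> nt = new TreeSet<Element>(new Comparator<Element>(){
        public int compare(Element a, Element b){
            if ( a == b )
                return 0;
            // Compare directly instead of subtracting, which can overflow
            if ( a.val > b.val )
                return 1;
            if ( a.val < b.val )
                return -1;
            // Equal val: report a duplicate so the set keeps only one
            return 0;
        }
    });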

for (Element elem: elements ){
    nt.add(elem);
}
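
On Java 8 or later the same approach can be written more compactly with Comparator.comparingInt. This is only a sketch: Node is a hypothetical stand-in for the class that cannot be modified, val is assumed to be an int, and label exists just to show which of two duplicates survives.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class ComparatorDemo {
    // Hypothetical stand-in for a class we cannot change; val mirrors the field above.
    static final class Node {
        final int val;
        final String label;

        Node(int val, String label) {
            this.val = val;
            this.label = label;
        }
    }

    // Keeps one node per distinct val, returned in ascending val order.
    static List<Node> uniqueByVal(List<Node> nodes) {
        TreeSet<Node> set = new TreeSet<>(Comparator.comparingInt((Node n) -> n.val));
        set.addAll(nodes); // a node whose val is already present is not added
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node(2, "first"));
        nodes.add(new Node(1, "other"));
        nodes.add(new Node(2, "second"));
        for (Node n : uniqueByVal(nodes)) {
            System.out.println(n.val + " " + n.label);
        }
        // prints:
        // 1 other
        // 2 first
    }
}
```

Note that a TreeSet decides uniqueness purely through the comparator, so two objects that compare as 0 are treated as the same element even if equals would say otherwise.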

Creating a list of unique objects