Question

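I am parsing a file with more than 4M lines. Each line has the form a^b^c^d^......^.... I want all the unique points in the file, where only the first two entries determine uniqueness. So what I did was: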

String str; // one line read from the file
Set<String> lines = new LinkedHashSet<String>();
Set<String> set = Collections.synchronizedSet(lines);
String[] str1 = str.split("\\^");
set.add(str1[0] + "^" + str1[1]); // key on the first two fields only

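This gives me the unique combinations of the first and second fields in the file. However, I also want the third field (the timestamp) that goes with each of those points, i.e. str1[2]. The new file should have the format: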

  str1[0]^str1[1]^str1[2]

How can I do this?

Answer 1

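A couple of solutions come to mind.

Create a class for the three entries. Override equals (and hashCode, so that hash-based sets behave consistently) to check only the first two entries, so that two objects are equal whenever their first two entries are equal. Now add all the items to a set. What you end up with in the set is a collection of unique first/second points, each carrying the first timestamp that was seen for it.
Another solution is to keep two collections: one with the two points plus the timestamp, and one with just the two points. You can call set.contains(...) to check whether you have already seen a point, and if you have not, add it to the points-plus-timestamp list.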
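The two-collection idea can be sketched roughly as follows. This is only an illustration: the class name `UniqueByPrefix`, the method `uniqueWithTimestamp`, and the choice to skip lines with fewer than three fields are assumptions, not part of the question. It relies on `Set.add` returning `false` for an element that is already present, which folds the `contains` check and the insertion into one call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniqueByPrefix {

    // Returns one "first^second^third" entry per distinct (first, second) pair,
    // keeping the third field from the first line on which the pair appears.
    static List<String> uniqueWithTimestamp(Iterable<String> lines) {
        Set<String> seen = new HashSet<>();      // "first^second" keys already encountered
        List<String> result = new ArrayList<>(); // output rows, in file order
        for (String line : lines) {
            String[] parts = line.split("\\^");
            if (parts.length < 3) {
                continue; // assumption: ignore lines with fewer than three fields
            }
            String key = parts[0] + "^" + parts[1];
            if (seen.add(key)) { // add() returns false if the key was already present
                result.add(key + "^" + parts[2]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("a^b^100^x", "a^b^200^y", "c^d^300^z");
        System.out.println(uniqueWithTimestamp(sample)); // prints [a^b^100, c^d^300]
    }
}
```

For a real file, the lines could be fed in from a BufferedReader one at a time instead of a list, so only the unique keys and output rows are held in memory.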

Answer 2

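Create a class that holds the information you want to store in the set, but only take the first two fields into account in equals / hashCode. Then you can do this: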

Set<Point> set = new HashSet<Point>();
String str1[] = str.split("\\^");
set.add(new Point(str1[0], str1[1], str1[2]));

using this class:

public class Point {

    String str1;
    String str2;
    String str3;

    public Point(String str1, String str2, String str3) {
        this.str1 = str1;
        this.str2 = str2;
        this.str3 = str3;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((str1 == null) ? 0 : str1.hashCode());
        result = prime * result + ((str2 == null) ? 0 : str2.hashCode());
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Point other = (Point) obj;
        if (str1 == null) {
            if (other.str1 != null)
                return false;
        } else if (!str1.equals(other.str1))
            return false;
        if (str2 == null) {
            if (other.str2 != null)
                return false;
        } else if (!str2.equals(other.str2))
            return false;
        return true;
    }
}
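To see how such a set behaves when the same first two fields appear more than once, here is a small self-contained sketch. It uses a compact stand-in for the class above built on java.util.Objects; the names `PointSetDemo`, `dedupe`, and `toLine` are hypothetical, and a LinkedHashSet is assumed so that the output keeps the order in which points first appear in the file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class PointSetDemo {

    // Compact stand-in for the Point class: equality ignores the third field.
    static final class Point {
        final String first;
        final String second;
        final String timestamp;

        Point(String first, String second, String timestamp) {
            this.first = first;
            this.second = second;
            this.timestamp = timestamp;
        }

        @Override
        public int hashCode() {
            return Objects.hash(first, second);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Point)) return false;
            Point other = (Point) obj;
            return Objects.equals(first, other.first)
                    && Objects.equals(second, other.second);
        }

        // Hypothetical helper: renders the point in the requested output format.
        String toLine() {
            return first + "^" + second + "^" + timestamp;
        }
    }

    static List<String> dedupe(List<String> lines) {
        Set<Point> set = new LinkedHashSet<>(); // preserves first-seen order
        for (String line : lines) {
            String[] parts = line.split("\\^");
            if (parts.length < 3) continue; // assumption: skip short lines
            // add() leaves the set unchanged if an equal Point is already there,
            // so the timestamp of the first occurrence is the one that is kept.
            set.add(new Point(parts[0], parts[1], parts[2]));
        }
        List<String> out = new ArrayList<>();
        for (Point p : set) out.add(p.toLine());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(dedupe(Arrays.asList("a^b^100", "a^b^200", "c^d^300")));
        // prints [a^b^100, c^d^300]
    }
}
```

Because adding an equal element does not replace the existing one, the first timestamp wins. If the latest timestamp were wanted instead, a map keyed on the first two fields would be a more natural fit.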

Getting unique entries from a file
