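Remove duplicate column values within a group

Date: 2014-09-04 04:52:07

Tags: java arraylist collections treeset

I read each row of a CSV file into a String[] array and store the arrays in an ArrayList. I need to group the rows by arr[0] and remove any row that repeats a column value already present in its group.

The number of columns (n) can differ from file to file. This example uses 3 columns: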

List<String[]> rowList = new ArrayList<String[]>();
BufferedReader reader = null;
reader = new BufferedReader(new FileReader("C:\\test.csv"));

String[] currLineSplitted;
while (reader.ready()) {
   currLineSplitted = reader.readLine().split(",");
   rowList.add(currLineSplitted);
}

Set<String[]> s = new TreeSet<String[]>(new Comparator<String[]>() {

    @Override
    public int compare(String[] o1, String[] o2) {
        int cmp = 0;
        if((o1[0]).compareTo(o2[0])==1){
            for(int i=1;i<currLineSplitted.length;i++){
            cmp = (o1[i]).compareTo(o2[i]);
            }
        } else {
            cmp=0;
        }

        return cmp;
    }
});

s.addAll(rowList);

List<Object> res = Arrays.asList(s.toArray());
for(Object obj:res){
    String[] arr = (String[])obj;
    System.out.println(arr[0]+","+arr[1]+","+arr[2]);
}

Input file:

{"1","a","gh"}        
{"1","a","rs"}        
{"1","b","cd"}
{"2","a","xy"}
{"2","b","xy"}
{"3","a","pq"}

Output:

1,a,gh
2,b,xy

Required output:

1,a,gh
1,a,rs //should be deleted as in group 1 a is repeated
1,b,cd
2,a,xy
2,b,xy //should be deleted as in group 2 xy is repeated
3,a,pq

2 answers:

Answer 0 (score: 0)

Create a class, for example ArrayClass:

public class ArrayClass{

private String firstItem,secondItem,thirdItem;

  public ArrayClass(String[] param){
    firstItem = param[0];
    secondItem = param[1];
    thirdItem = param[2];
  }

//getters and setters
}

Then override the equals and hashCode methods:

@Override
public boolean equals(Object obj) {

    // TODO Auto-generated method stub
    if (this == obj) return true;

    if (obj == null || (this.getClass() != obj.getClass())) {
        return false;
    }

    ArrayClass aC = (ArrayClass) obj;

    return (this.firstItem.equals(aC.getFirstItem())
            && this.secondItem.equals(aC.getSecondItem()))
            || (this.firstItem.equals(aC.getFirstItem())
            && this.thirdItem.equals(aC.getThirdItem()));
}

@Override
public int hashCode() {

    // Equal objects must return the same hash code. Both branches of
    // equals() require a matching firstItem, so hash on that field only.
    return firstItem != null ? firstItem.hashCode() : 0;
}
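A HashSet only calls equals on elements whose hash codes collide, so two objects that equals treats as duplicates must also produce the same hash code. The following is a minimal sketch of that contract, not the answer's exact class: Row is a hypothetical stand-in for ArrayClass, and it hashes on the first column only because that is the one field every pair of equal rows shares. Note that this equals is still not transitive, so which rows survive can depend on insertion order.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class HashContractDemo {

    // Hypothetical stand-in for ArrayClass: two rows are equal when the
    // group (first column) matches and the second or third column matches.
    static final class Row {
        final String first, second, third;

        Row(String first, String second, String third) {
            this.first = first;
            this.second = second;
            this.third = third;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Row)) return false;
            Row other = (Row) obj;
            return first.equals(other.first)
                    && (second.equals(other.second) || third.equals(other.third));
        }

        // Hash only on the field that every pair of equal rows shares.
        @Override
        public int hashCode() {
            return Objects.hashCode(first);
        }
    }

    public static void main(String[] args) {
        Set<Row> rows = new HashSet<>();
        rows.add(new Row("1", "a", "gh"));
        // Same group and same second column, so the set rejects it.
        boolean added = rows.add(new Row("1", "a", "rs"));
        System.out.println(added);       // false
        System.out.println(rows.size()); // 1
    }
}
```

If hashCode were based on the third column instead, "gh" and "rs" would usually hash differently and the second row would be added despite equals returning true.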

Then use a Set instead of a List in your main class:

Set<ArrayClass> testSet = new HashSet<ArrayClass>();

Then modify your while loop:

while (reader.ready()) {
                ArrayClass aC = new ArrayClass(reader.readLine().split(","));
                testSet.add(aC);
            }

Display the output:

for(ArrayClass aC : testSet){
    System.out.println(aC.getFirstItem()+","+aC.getSecondItem()+","+aC.getThirdItem());
}

Output:

1,a,gh
1,b,cd
2,a,xy
3,a,pq

Answer 1 (score: 0)

You were almost right. I modified your compare function slightly, so replace it with this:
       @Override
        public int compare(String[] o1, String[] o2) {
            int cmp = 0;

            if(o1[0].equals(o2[0])){//grouping 1st column
                for(int i=1;i<o1.length;i++){
                    cmp = (o1[i]).compareTo(o2[i]);
                    if(cmp==0)
                        return cmp;// if two column matched return immediately
                }
            } else {
                return o1[0].compareTo(o2[0]);
            }

            return cmp;
        }
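A rule of the form "equal if any column matches" is not transitive, so it does not define a consistent ordering, and a TreeSet only compares a new element against the nodes on its search path rather than against every row in the group. As an alternative sketch that avoids a comparator altogether (a swapped-in technique, not the approach above), one can remember, per group, the values already kept in each column. The class name GroupDedup and the method dedupe are hypothetical, Java 8 or later is assumed, and a row is checked only against rows that were kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GroupDedup {

    // Keeps a row unless one of its non-key columns repeats a value already
    // kept in the same column for the same group (group = column 0).
    static List<String[]> dedupe(List<String[]> rows) {
        // group key -> one set of seen values per column index
        Map<String, List<Set<String>>> seenByGroup = new HashMap<>();
        List<String[]> kept = new ArrayList<>();

        for (String[] row : rows) {
            List<Set<String>> seen =
                    seenByGroup.computeIfAbsent(row[0], k -> new ArrayList<>());
            // Grow the per-column sets if this row is wider than earlier ones.
            while (seen.size() < row.length) {
                seen.add(new HashSet<>());
            }

            boolean duplicate = false;
            for (int i = 1; i < row.length; i++) {
                if (seen.get(i).contains(row[i])) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                for (int i = 1; i < row.length; i++) {
                    seen.get(i).add(row[i]);
                }
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"1", "a", "gh"},
                new String[] {"1", "a", "rs"},
                new String[] {"1", "b", "cd"},
                new String[] {"2", "a", "xy"},
                new String[] {"2", "b", "xy"},
                new String[] {"3", "a", "pq"});
        for (String[] row : dedupe(rows)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

On the sample input from the question this keeps 1,a,gh, 1,b,cd, 2,a,xy and 3,a,pq, in input order, and works for any number of columns.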

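Keep in mind that String.compareTo is not guaranteed to return 1. It compares two strings lexicographically and returns zero only when they are equal; otherwise it returns a negative or positive value whose magnitude can be anything. So in your code, the following condition introduces a logic error:

(o1[0]).compareTo(o2[0])==1

See the String.compareTo documentation for details.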
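To make the point about the return value concrete, here is a small sketch. The class name CompareToDemo and the helper sameGroup are hypothetical; the printed values follow from how String.compareTo is specified (difference of the first differing characters, or difference in length when one string is a prefix of the other).

```java
public class CompareToDemo {

    // Hypothetical helper: tests whether two rows share the same group key
    // without relying on a specific compareTo value.
    static boolean sameGroup(String[] o1, String[] o2) {
        return o1[0].equals(o2[0]);
    }

    public static void main(String[] args) {
        System.out.println("b".compareTo("a"));  // 1
        System.out.println("c".compareTo("a"));  // 2
        System.out.println("a".compareTo("c"));  // -2
        System.out.println("1".compareTo("1"));  // 0
        System.out.println("10".compareTo("1")); // 1 (length difference)

        String[] r1 = {"1", "a", "gh"};
        String[] r2 = {"1", "b", "cd"};
        System.out.println(sameGroup(r1, r2));   // true
    }
}
```

Checking the sign (or using equals for equality) is the safe way to interpret the result; comparing against the literal 1 only matches strings that happen to differ by exactly one.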