Question

Below is the code I use to read an Excel file with POI. It works fine:

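import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ReadExcelDemo {
    public static void main(String[] args) {
        // try-with-resources closes the stream and the workbook when done
        try (FileInputStream file = new FileInputStream(new File("demo.xlsx"));
             XSSFWorkbook workbook = new XSSFWorkbook(file)) {

            XSSFSheet sheet = workbook.getSheetAt(0);
            ArrayList<Form> vipList = new ArrayList<Form>();

            Iterator<Row> rowIterator = sheet.iterator();
            while (rowIterator.hasNext()) {
                Row row = rowIterator.next();
                Iterator<Cell> cellIterator = row.cellIterator();

                while (cellIterator.hasNext()) {
                    Cell cell = cellIterator.next();
                    // int constants from POI 3.x; POI 4+ uses CellType.NUMERIC / CellType.STRING
                    switch (cell.getCellType()) {
                        case Cell.CELL_TYPE_NUMERIC:
                            System.out.print(cell.getNumericCellValue() + "\t");
                            break;
                        case Cell.CELL_TYPE_STRING:
                            System.out.print(cell.getStringCellValue() + "\t");
                            break;
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}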

Now, if the Excel file contains duplicate records, I should be able to print a simple error message. How do I do that?

Example:

ID    Firstname     Lastname     Address
  1     Ron           wills      Paris
  1     Ron           wills      London

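I want to check for duplicates on only 3 columns: ID, Firstname and Lastname. If those columns together contain the same data, as in the example above, the row must be treated as a duplicate.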

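I have a POJO class Form with id, firstname and lastname fields and their getters and setters. Each record that is read is written into the POJO through the setters. I then use the getters to fetch the values and add them to an ArrayList, so the list now contains all the records. How do I compare them?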

Answer 1

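Put the data into a set and check contains before each new entry. If you use a HashSet, the lookup will be very fast. You can treat everything as a string comparison.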

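    Set<String> data = new HashSet<String>();
    DataFormatter formatter = new DataFormatter(); // org.apache.poi.ss.usermodel.DataFormatter

    while (cellIterator.hasNext()) {
        Cell cell = cellIterator.next();

        // Format every cell as text so numeric and string cells compare the same way
        // (getStringCellValue() throws on a numeric cell).
        String value = formatter.formatCellValue(cell);
        if (data.contains(value)) {
            // IllegalStateException stands in for the original, undefined IllegalDataException
            throw new IllegalStateException("Duplicate value: " + value);
        }
        data.add(value);

        switch (cell.getCellType()) {
            case Cell.CELL_TYPE_NUMERIC:
                System.out.print(cell.getNumericCellValue() + "\t");
                break;
            case Cell.CELL_TYPE_STRING:
                System.out.print(cell.getStringCellValue() + "\t");
                break;
        }
    }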

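If you actually need to compare whole rows, create a class that holds all the fields and override its equals method (and hashCode, which HashSet relies on). Then put the instances into a set and compare.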
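A minimal sketch of that whole-row idea, applied to the three columns from the question. The names RowKey and firstDuplicate are hypothetical, and the rows are passed in as plain string arrays instead of being read through POI so that the example stays self-contained:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class DuplicateRowCheck {

    /** Hypothetical key holding only the columns that define a duplicate. */
    static final class RowKey {
        private final String id;
        private final String firstname;
        private final String lastname;

        RowKey(String id, String firstname, String lastname) {
            this.id = id;
            this.firstname = firstname;
            this.lastname = lastname;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof RowKey)) return false;
            RowKey other = (RowKey) o;
            return Objects.equals(id, other.id)
                    && Objects.equals(firstname, other.firstname)
                    && Objects.equals(lastname, other.lastname);
        }

        @Override
        public int hashCode() {
            // Must agree with equals, otherwise HashSet cannot find equal keys
            return Objects.hash(id, firstname, lastname);
        }
    }

    /** Returns the 0-based index of the first duplicate row, or -1 if there is none. */
    static int firstDuplicate(String[][] rows) {
        Set<RowKey> seen = new HashSet<>();
        for (int i = 0; i < rows.length; i++) {
            // The fourth column (Address) is deliberately left out of the key
            RowKey key = new RowKey(rows[i][0], rows[i][1], rows[i][2]);
            if (!seen.add(key)) { // add returns false when an equal key is already present
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"1", "Ron", "wills", "Paris"},
            {"1", "Ron", "wills", "London"},
        };
        int dup = firstDuplicate(rows);
        if (dup >= 0) {
            System.out.println("Error: duplicate record at row " + (dup + 1));
        }
    }
}
```

With real data, each array element would come from the corresponding cell of the row being read.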

Answer 2

public class ProcessAction extends DispatchAction {

    String dupValue = null;
    ArrayList<String> dupList = new ArrayList<String>();

    private String validateDuplicateRecords(ProcessForm process) {
        String errorMessage = null;

        // Build one composite key from the three columns that define a duplicate
        dupValue = process.getId().trim()+"    "+process.getFirstname().trim()+"    "+process.getLastname().trim();
        mLogger.debug("id, firstname, lastname: "+dupValue);
        if (dupList.contains(dupValue)){
            mLogger.debug("value not added");
            errorMessage = "Duplicate Record Exists";
        } else {
            dupList.add(dupValue);
        }

        return errorMessage;
    }
}

Don't forget to clear the duplicate ArrayList. In my case, once a task such as writing the ArrayList to a file is finished, I clear the duplicate list with:

dupList.clear();

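If you don't, then when you upload the same data again, records will be reported as duplicates even though they are not, because dupList still contains the data from the previous upload.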
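The stale-state problem can be reproduced without Struts or POI. The sketch below is an illustration, not the answer's actual class: UploadDuplicateTracker, validate and reset are hypothetical names, a HashSet is swapped in for the ArrayList so lookups do not scan the whole list, and the separator character is an assumption chosen because it is unlikely to appear in cell text.

```java
import java.util.HashSet;
import java.util.Set;

public class UploadDuplicateTracker {

    // Assumed separator (ASCII unit separator); keeps "1"+"2 Ron" distinct from "12"+"Ron"
    private static final char SEP = '\u001F';

    private final Set<String> seen = new HashSet<>();

    /** Returns an error message for a repeated record, or null for a new one. */
    public String validate(String id, String firstname, String lastname) {
        String key = id.trim() + SEP + firstname.trim() + SEP + lastname.trim();
        return seen.add(key) ? null : "Duplicate Record Exists";
    }

    /** Call once an upload has been fully processed, like dupList.clear(). */
    public void reset() {
        seen.clear();
    }

    public static void main(String[] args) {
        UploadDuplicateTracker tracker = new UploadDuplicateTracker();

        // First upload: the second identical record is a real duplicate
        System.out.println(tracker.validate("1", "Ron", "wills"));
        System.out.println(tracker.validate("1", "Ron", "wills"));

        // Without this reset, the next upload would be flagged on its first record
        tracker.reset();
        System.out.println(tracker.validate("1", "Ron", "wills"));
    }
}
```

Keeping the set as an instance field of a long-lived object, as in the answer, is what makes the reset necessary. Creating a fresh tracker per upload would avoid the problem altogether.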

Answer 3

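Here's a hint. While looping, add your ID (the value you check for duplicates) to a HashMap. If the size of the map does not change, the record is a duplicate, because putting a key that already exists overwrites the old entry instead of adding a new one. Here is an example from my code: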

switch(cellType)
{
case 0: // numeric cell (Cell.CELL_TYPE_NUMERIC in POI 3.x)
    your_id = cell1.getNumericCellValue();
    mapSize = map.size();

    map.put(your_id, your_id);
    mapSizeAfterPut = map.size();

    if(mapSize == mapSizeAfterPut)
    {
        duplicatedRecordsList.add(index);
    }

    break;
case 1: // string cell (Cell.CELL_TYPE_STRING in POI 3.x)
    your_id = cell1.getStringCellValue();
    mapSize = map.size();

    map.put(your_id , your_id);
    mapSizeAfterPut = map.size();

    if(mapSize == mapSizeAfterPut) 
    {
        duplicatedRecordsList.add(index);
    }

    break;
default:break;
}
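The fragment above depends on variables declared elsewhere. As a self-contained illustration of the same size-comparison trick, here is a sketch in which the IDs are supplied as a plain list; the class and method names are hypothetical, and reading the values from cells is left out:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapSizeDuplicateCheck {

    /** Returns the indexes of all IDs that already occurred earlier in the list. */
    static List<Integer> duplicateIndexes(List<?> ids) {
        Map<Object, Object> map = new HashMap<>();
        List<Integer> duplicatedRecordsList = new ArrayList<>();

        for (int index = 0; index < ids.size(); index++) {
            Object id = ids.get(index);
            int mapSize = map.size();
            map.put(id, id); // overwrites the entry if the key already exists
            int mapSizeAfterPut = map.size();

            // Unchanged size means the key was already in the map
            if (mapSize == mapSizeAfterPut) {
                duplicatedRecordsList.add(index);
            }
        }
        return duplicatedRecordsList;
    }

    public static void main(String[] args) {
        // Mixed numeric and string IDs, as in the two switch branches
        List<Object> ids = Arrays.asList(1.0, "A7", 1.0, "A7", 2.0);
        System.out.println(duplicateIndexes(ids)); // prints [2, 3]
    }
}
```

The same information is available more directly from Set.add, which returns false when the element is already present, so the size bookkeeping is optional.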

How do I check for duplicate records in Excel using POI?
