How to compare two HashMap<String, List<String>>

Asked: 2014-07-28 20:27:29

Tags: java excel jdbc hashmap apache-poi

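I want to compare data read from an Excel file (column 1 is the key, column 2 is the value), which is put into a HashMap, with data obtained from an SQL query. At first I used HashMap<String, String> because I only needed to compare <key, value> pairs, but now I have to compare <key, List> and I'm a bit stuck. Here is my code for reading the xls file: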

public class ReadExcel {

    HashMap<String, List<String>> result = new HashMap<String, List<String>>();

public HashMap<String, List<String>> process() {
    try
    {
        result.clear();

        FileInputStream file = new FileInputStream(new File("C:/some.xlsx"));

        //Create Workbook instance holding reference to .xlsx file
        XSSFWorkbook workbook = new XSSFWorkbook(file);

        //Get first/desired sheet from the workbook
        XSSFSheet sheet = workbook.getSheetAt(0);

        //Iterate through each rows one by one
        Iterator<Row> rowIterator = sheet.iterator();


        while (rowIterator.hasNext()) {
            List<String> xlsList = new ArrayList<String>();

                Row row = rowIterator.next();
                Cell cell  = row.getCell(1);
                Cell cell2 = row.getCell(2);
                String key ="";
                String value="";
                xlsList.clear();
                switch (cell.getCellType())
                {
                    case Cell.CELL_TYPE_NUMERIC:
                        key = getStringCellValue(cell);
                        value = getNumericCellValue(cell2);
                        break;

                    case Cell.CELL_TYPE_STRING:
                        key = getStringCellValue(cell);
                        value = getStringCellValue(cell2);
                        break;

                }

                xlsList.add(value);
                result.put(key, xlsList);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return result;
}
}

For example, in my Excel file:

row 1: column 1 = car, column 2 = blue
row 2: column 1 = car, column 2 = yellow.

When I run the Excel reader, it stores the values "blue, yellow" under the key "car" in the HashMap just fine. But when I have, for example:

row 1: column 1 = car,  column 2 = blue
row 2: column 1 = car,  column 2 = yellow
row 3: column 1 = year, column 2 = 1990
row 4: column 1 = year, column 2 = 1999

it only shows: car=[yellow], year=[1999]. It keeps only the last value for each key; when there are no duplicate keys, this works fine.

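First question: how can I do this better? If the same key appears in column 1 on several rows, how do I store the key only once and keep all the column 2 values that belong to it?

Here is my code for pulling the data from the SQL database: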

public class DB {
    HashMap<String, List<String>> result = new HashMap<String, List<String>>();

public HashMap<String, List<String>> process() {

    result.clear();

    Connection conn = null;
    Statement stmt = null;
    List<String> carColour = new ArrayList<String>();

try {
        Class.forName("oracle.jdbc.driver.OracleDriver");
        conn = DriverManager.getConnection(DB_URL, USER, PASS);
        stmt = conn.createStatement();
        // Placeholder: a SELECT statement that returns 2 or more rows, i.e. "car" has 2 or more values
        String sql1 = "SOME SQL SELECT STATEMENT";
        ResultSet rs = stmt.executeQuery(sql1);

        while(rs.next()){
            carColour.add(rs.getString("colour")); // i select the column "colour"
            result.put("car", carColour);         // i put "car" as key, and "blue" and "yellow" as values
            ...................................
        }
} catch...
}

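This database code works fine, but if I want to pull more columns, such as "colour" or "year", I have to create a list for each of them, and with 20 columns to pull that becomes very time-consuming.

Second question: how can I do this more simply, without creating 20 lists? (Maybe reuse the same list and call list.clear() on it?) It depends, because if the table has 30 columns and I only need 20, I could call getString("column") on all of them and drop the ones I don't want, but how?

Here is the comparison code: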

final Map<String, Boolean> comparisonResult = compareEntries(dbResult, xlsResult);
        for(final Entry<String, Boolean> entry : comparisonResult.entrySet()){
            if (entry.getValue() == false){
                System.out.println("------------------------------------------------------------------------");
                System.out.println("| Comparison FAILED | Value not matching! Column name --> " + entry.getKey() + " |");
            }
        }
        System.out.println("------------------------------------------------------------------------");
        System.out.println("DB consistency check finished.");

............................................... .................................

public static <K extends Comparable<? super K>, V>
Map<K, Boolean> compareEntries(final Map<K, V> dbResult,
    final Map<K, V> xlsResult){
    final Collection<K> allKeys = new HashSet<K>();
    allKeys.addAll(dbResult.keySet());
    allKeys.addAll(xlsResult.keySet());
    final Map<K, Boolean> result = new TreeMap<K, Boolean>();
    for(final K key : allKeys){
        result.put(key, dbResult.containsKey(key) == xlsResult.containsKey(key) && Boolean.valueOf(equal(dbResult.get(key), xlsResult.get(key))));
    }
    return result;
}

private static boolean equal(final Object obj1, final Object obj2){
    return obj1 == obj2 || (obj1 != null && obj1.equals(obj2));
}

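Final question: how can I improve the code that compares the two HashMap<String, List<String>> maps, or how could I approach this better, step by step? Thanks!

1 Answer:

Answer 0: (score: 0)

Your ReadExcel reading loop has a serious problem: you create a new List for every row instead of reusing the List already mapped by the key, so you end up with lists that hold only one (the last) value.

Here is a way to fix it (using Java 8):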

// ...
while (rowIterator.hasNext()) {
    Row row = rowIterator.next();
    Cell keyCell = row.getCell(1);
    Cell valCell = row.getCell(2);

    String key = getStringCellValue(keyCell);
    String value = "";
    // getStringCellValue/getNumericCellValue are the helper methods from the question
    switch (valCell.getCellType()) {
        case Cell.CELL_TYPE_NUMERIC:
            value = getNumericCellValue(valCell);
            break;
        case Cell.CELL_TYPE_STRING:
            value = getStringCellValue(valCell);
            break;
    }

    // computeIfAbsent only exists since Java 8; it creates the list on the
    // first occurrence of the key and reuses it on every later row
    result.computeIfAbsent(key, k -> new ArrayList<String>()).add(value);
}

If you are not on Java 8 yet, go download it quickly, or replace the last line above with:

List<String> list = result.get(key);
if (list == null) result.put(key, list = new ArrayList<String>());
list.add(value);
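
To see the "reuse the collection already mapped by the key" idea in isolation, here is a minimal, self-contained sketch. It assumes the rows are already available as {key, value} string pairs rather than POI cells; the class name GroupRowsDemo and the method group are made up for this illustration and are not part of the question's code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupRowsDemo {

    // Groups {key, value} pairs so that each key maps to all of its values, in row order.
    // Assumption: every row has exactly two non-null entries.
    static Map<String, List<String>> group(String[][] rows) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String[] row : rows) {
            // Create the list only the first time a key is seen, then keep appending to it.
            result.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        String[][] rows = {
            {"car", "blue"}, {"car", "yellow"}, {"year", "1990"}, {"year", "1999"}
        };
        System.out.println(group(rows)); // {car=[blue, yellow], year=[1990, 1999]}
    }
}
```

A LinkedHashMap is used here only so the printed output follows the row order; a plain HashMap groups the values the same way.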

Comparing Map objects is fairly simple: you can compare them directly with equals:

if (map1.equals(map2)) {
    // both maps are equal!
} else {
    // maps are NOT equal!
}

Of course, this only yields true or false. If you want the details (which entries differ), you need to work that out yourself.
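
If you do need the per-key details, one possible approach is sketched below. It assumes both maps are Map<String, List<String>> with no null keys or values, and that the order of the values inside a list should not count as a difference (List.equals is order-sensitive, so the sketch sorts copies before comparing). The names MapDiffDemo and diff are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class MapDiffDemo {

    // Returns one message per key whose values differ; an empty map means the inputs match.
    // Assumption: no null keys and no null values in either map.
    static Map<String, String> diff(Map<String, List<String>> db,
                                    Map<String, List<String>> xls) {
        TreeSet<String> allKeys = new TreeSet<>(db.keySet());
        allKeys.addAll(xls.keySet());

        Map<String, String> differences = new TreeMap<>();
        for (String key : allKeys) {
            if (!db.containsKey(key)) {
                differences.put(key, "missing in DB");
            } else if (!xls.containsKey(key)) {
                differences.put(key, "missing in Excel");
            } else {
                // Sort copies so the originals stay untouched and row order is ignored.
                List<String> a = new ArrayList<>(db.get(key));
                List<String> b = new ArrayList<>(xls.get(key));
                Collections.sort(a);
                Collections.sort(b);
                if (!a.equals(b)) {
                    differences.put(key, "DB=" + a + " Excel=" + b);
                }
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        Map<String, List<String>> db = new HashMap<>();
        db.put("car", Arrays.asList("yellow", "blue"));
        db.put("year", Arrays.asList("1990", "1999"));

        Map<String, List<String>> xls = new HashMap<>();
        xls.put("car", Arrays.asList("blue", "yellow"));
        xls.put("year", Arrays.asList("1990"));
        xls.put("model", Arrays.asList("sedan"));

        // Prints: {model=missing in DB, year=DB=[1990, 1999] Excel=[1990]}
        System.out.println(diff(db, xls));
    }
}
```

If the order of values does matter for your check, drop the two Collections.sort calls and compare the lists as they are.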