CSVJDBC - aggregate functions interpret values as strings instead of integers

Time: 2015-07-01 14:08:52

Tags: java sql csv csvjdbc

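I am using the CSVJDBC driver to retrieve results from a CSV file. All record fields are interpreted as strings. How can I use the MAX aggregate function to get the largest integer in a column? As far as I know, csvjdbc does not support casts.

Consider this sample file: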

sequenceNumber,decimalNumber,randomInteger,email,testNumber
0,0.4868176550817932,560801,cleta.stroman@gmail.com,0.0
1,0.9889360969432277,903488,chelsie.roob@hotmail.com,1.0
2,0.8161798688893893,367870,hardy.waelchi@yahoo.com,2.0
3,0.926163166852633,588581,rafaela.white@hotmail.com,3.0
4,0.05084859872223901,563000,belle.hagenes@gmail.com,4.0
5,0.7636864392027013,375299,joey.beier@gmail.com,5.0
6,0.31433980690632457,544036,cornell.will@gmail.com,6.0
7,0.4061012200967966,41792,catalina.kemmer@gmail.com,7.0
8,0.3541002754332119,196272,raoul.bogisich@yahoo.com,8.0
9,0.4189826302561652,798405,clay.roberts@yahoo.com,9.0
10,0.9076084714059381,135783,angel.white@yahoo.com,10.0
11,0.565716974613909,865847,marlin.hoppe@gmail.com,11.0
12,0.9484076609924861,224744,anjali.stanton@gmail.com,12.0
13,0.05223710002804138,977787,harley.morar@hotmail.com,13.0
15,0.6270851001160621,469901,eldora.schmeler@yahoo.com,14.0

I use the following code snippet:

import org.relique.jdbc.csv.CsvDriver;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;


public class CSVDemo
{
    public static void main(String[] args)
    {
    try
    {
        // Load the driver.
        Class.forName("org.relique.jdbc.csv.CsvDriver");

        // Create a connection. CSVDIRECTORY (hard-coded below) is
        // the directory containing the .csv files.
        // A single connection is thread-safe for use by several threads.

        String CSVDIRECTORY = "/tmp/csv-directory/";
        String CSVDB ="mediumList";
        Connection conn = DriverManager.getConnection("jdbc:relique:csv:" + CSVDIRECTORY);

        // Create a Statement object to execute the query with.
        // A Statement is not thread-safe.
        Statement stmt = conn.createStatement();

        ResultSet results = stmt.executeQuery("SELECT MAX(decimalNumber) FROM "+CSVDB);

        // Dump out the results to a CSV file with the same format
        // using CsvJdbc helper function
        boolean append = true;
        CsvDriver.writeToCsv(results, System.out, append);

        // Clean up
        conn.close();
    }
    catch(Exception e)
    {
        e.printStackTrace();
    }
    }
}

When I run the query, I get the expected result:

MAX([DECIMALNUMBER])
0.9889360969432277

But when I ask for the largest sequenceNumber, which is 19:

ResultSet results = stmt.executeQuery("SELECT MAX(sequenceNumber) FROM "+CSVDB);

I get 9 instead:

MAX([SEQUENCENUMBER])
9

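It works for decimalNumber, and it also works for text. It does not work for testNumber, because csvjdbc returns the lexicographic maximum rather than the integer value. Is there a way to solve this directly, or do I need to fetch all records and pick the maximum in Java?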
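To make the difference concrete, here is a minimal sketch in plain Java (no CsvJdbc involved) that compares the two orderings on the sequenceNumber values from the sample rows above. The class and method names are made up for illustration. It also suggests why decimalNumber only appears to work: every value in that column starts with "0.", so string order and numeric order happen to agree.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of CsvJdbc.
class StringVersusIntMax {

    // Largest value under String (lexicographic) ordering, which mirrors
    // the behaviour described in the question for untyped columns.
    static String maxAsString(List<String> values) {
        return Collections.max(values);
    }

    // Largest value after parsing every field as an int.
    static int maxAsInt(List<String> values) {
        int max = Integer.MIN_VALUE;
        for (String value : values) {
            max = Math.max(max, Integer.parseInt(value));
        }
        return max;
    }

    public static void main(String[] args) {
        // sequenceNumber values from the sample rows shown above.
        List<String> sequenceNumbers = Arrays.asList(
                "0", "1", "2", "3", "4", "5", "6", "7",
                "8", "9", "10", "11", "12", "13", "15");

        System.out.println(maxAsString(sequenceNumbers)); // prints 9
        System.out.println(maxAsInt(sequenceNumbers));    // prints 15
    }
}
```

As strings, "9" sorts after "10" and "15" because the comparison stops at the first differing character, so the lexicographic maximum is "9" even though the numeric maximum of these rows is 15.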

Basic solution:

This is my basic solution, which requires fetching all the numbers first:

        ResultSet results = stmt.executeQuery("SELECT sequenceNumber FROM "+CSVDB);
        int max=-1;

        while(results.next()){
            String sum = results.getString(1);

            int currentSeq = Integer.parseInt(sum);
            System.out.println("current_ "+sum);
            if(currentSeq>max){
                max=currentSeq;
            }
        }

Is there a more elegant way?

Solution based on Joop Eggen's answer:

public int getMaxSequenceAggregate() {
       int max = 0;
       try {
           Properties props = new Properties();
           Connection connection;

           props.put("columnTypes", "Int,Double,Int,String,Int");
           connection = DriverManager.getConnection("jdbc:relique:csv:" + this.directoryPath, props);
           PreparedStatement statement = null;
           ResultSet result;
           statement = connection.prepareStatement("SELECT MAX(sequenceNumber) FROM " + this.filePath);
           result = statement.executeQuery();

           while (result.next()) {
               max = result.getInt(1);
               LOGGER.info("maximum sequence: " + max);

           }

           connection.close();
       } catch (SQLException e) {
           e.printStackTrace();
       }

       return max;
   }

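2 answers:

Answer 0: (score: 2)

You should specify the column types explicitly, because it looks like the first column is being treated as a string, where "9" > "10".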

Properties props = new Properties();
props.put("columnTypes", "Integer,Double,Integer,String,Integer");
Connection conn = DriverManager.getConnection("jdbc:relique:csv:" + CSVDIRECTORY, props);

Answer 1: (score: 0)

The CSV/JDBC documentation says:

If columnTypes is set to an empty string, the column types are inferred from the data.

I guess this is preferable in most use cases. So Joop Eggen's example can be simplified by setting columnTypes to an empty string instead of listing a type for every column.
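A minimal sketch of that empty-columnTypes setup, assuming the CsvJdbc jar is on the classpath. The class name, method names, and the `csvDirectory` and `tableName` parameters are hypothetical and only serve the illustration. The connection URL prefix and the `columnTypes` property are the ones already used in this thread.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

// Hypothetical helper class, not part of CsvJdbc.
class InferredTypesSketch {

    // An empty columnTypes value asks the driver to infer the column
    // types from the data, per the documentation quoted in this answer.
    static Properties inferredTypeProps() {
        Properties props = new Properties();
        props.setProperty("columnTypes", "");
        return props;
    }

    // Assumption: the driver is registered automatically. If it is not,
    // call Class.forName("org.relique.jdbc.csv.CsvDriver") first, as the
    // question's code does.
    static int maxSequenceNumber(String csvDirectory, String tableName) throws SQLException {
        String url = "jdbc:relique:csv:" + csvDirectory;
        // try-with-resources closes the result set, statement and connection.
        try (Connection conn = DriverManager.getConnection(url, inferredTypeProps());
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT MAX(sequenceNumber) FROM " + tableName)) {
            // -1 is an arbitrary fallback for an empty result.
            return rs.next() ? rs.getInt(1) : -1;
        }
    }
}
```

With the question's values this could be called as `InferredTypesSketch.maxSequenceNumber("/tmp/csv-directory/", "mediumList")`.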

I tried this, and it gives dynamic type detection similar to other JDBC drivers. I wonder why it is not the default.