Multithreaded asynchronous writes to Cassandra using the Java client

Date: 2015-11-19 13:25:43

Tags: java multithreading cassandra cassandra-2.0

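I am new to both Cassandra and Java. I am trying to read a file containing 1 million records and load it into a Cassandra database using an ExecutorService. The keyspace already exists. However, after the code finishes, I find only 10 records in the database. What is wrong, and what should I change?

My code is below.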

public class UsingThread {


    public static void main(String[] args) throws InterruptedException, ExecutionException, FileNotFoundException {

            Cluster cluster1 = Cluster.builder().addContactPoint("127.0.0.1").withRetryPolicy(DefaultRetryPolicy.INSTANCE).withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy())).build();
            final String csvFile = "C:/Users/AT/workspace1/catalog.csv";


            final  BufferedReader   br = new BufferedReader(new FileReader(csvFile));


              //Creating Session object
              final Session session = cluster1.connect("demo");


                  String query2 = "CREATE TABLE InventoryAB(Item_ID text PRIMARY KEY, "+"Desc1 text, "
                 + "Quality text, "
                 + "Node_No int, "
                 + "Type text, "
                 + "Curr text,"+ "Manu text, "+"Desc2 text, "+"Qty int);";
                session.execute(query2);

                PreparedStatement statement = session.prepare("INSERT INTO demo.InventoryAB(Item_ID, Desc1, Quality, Node_No, Type, Curr, Manu, Desc2, Qty) VALUES (?,?,?,?,?,?,?,?,?)");

        int amountOfThreads = 10;
        ExecutorService threadPool = Executors.newFixedThreadPool(amountOfThreads);
        ExecutorCompletionService<String> tasks = new ExecutorCompletionService<String>(threadPool);
        long currentTimeE = System.nanoTime();
        for(int i=0; i < amountOfThreads; i++) {
            tasks.submit(new Callable<String>() {

                @Override
                public String call() throws Exception {

                    String ajay="";

                    String line = "";
                     while ((line = br.readLine()) != null) {
                        //System.out.println("Current thread is"+line);



                ajay=line;
                }


                    return ajay;
                }
            });
        }

            for(int i=0; i < amountOfThreads; i++) {
                Future<String> task = tasks.take();
                 String line = task.get();
                    String cvsSplitBy = ",";
                String[] column = line.split(cvsSplitBy);
                int h= Integer.parseInt(column[3]);
                int u= Integer.parseInt(column[8]);
                BoundStatement bind = statement.bind(column[0], column[1], column[2],h,column[4],column[5],column[6],column[7],u);
                session.executeAsync(bind);
            }

        threadPool.shutdown();
        long currentTimeF = System.nanoTime();
           long total2=currentTimeF-currentTimeE;
           System.out.println("Total time taken to load"+total2);
           try {
            br.close();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        session.close();
          cluster1.close();

        System.exit(0);
    }
}

1 answer:

Answer 0 (score: 2)

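The problem is in the logic you use to read the file: each thread keeps only the last line it read.

But

You should never access one file from multiple threads!!!

In your example, every thread got lucky and read at least one line, but that is not guaranteed...

Besides, none of this is necessary: session.executeAsync() already takes care of inserting into Cassandra asynchronously.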
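
As a rough illustration of that advice, here is a minimal sketch that reads the file from a single thread and submits every row asynchronously. The class name SingleReaderAsyncLoader, the load method, and the asyncInsert and maxInFlight parameters are all hypothetical, not part of any driver. asyncInsert stands in for something like session.executeAsync(statement.bind(...)). In the 2.x Java driver that call returns a ResultSetFuture rather than a CompletableFuture, so a small adapter would be needed. The sketch also limits the number of in-flight writes and waits for them to finish, on the assumption that closing the session or calling System.exit(0) too early could drop pending requests.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class SingleReaderAsyncLoader {

    /**
     * Reads every line from ONE thread and hands each row to asyncInsert.
     * asyncInsert is a hypothetical stand-in for an asynchronous Cassandra
     * write adapted to a CompletableFuture.
     * Returns the number of rows whose write completed without an error.
     */
    public static int load(BufferedReader reader,
                           Function<String[], CompletableFuture<Void>> asyncInsert,
                           int maxInFlight) throws IOException, InterruptedException {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger failed = new AtomicInteger();
        int submitted = 0;

        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] columns = line.split(",");

            // Block here while maxInFlight writes are still outstanding.
            permits.acquire();
            asyncInsert.apply(columns).whenComplete((result, error) -> {
                if (error != null) {
                    failed.incrementAndGet();
                }
                permits.release();
            });
            submitted++;
        }

        // Taking every permit back means all outstanding writes have completed.
        permits.acquire(maxInFlight);
        return submitted - failed.get();
    }
}
```

With this shape, every line of the file becomes one insert instead of one insert per thread, and the caller would close the session only after load returns.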