Question

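I have set up a Flink standalone cluster with 3 nodes, and I want to select some data from a database. To get better performance, I split one SQL query into 3, hoping the 3 queries would run distributed across the 3 nodes, but in practice all 3 queries run on the same node. Here is my code.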

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(3);

    for (int i = 0; i < 3; i++) {
        String from = String.valueOf(i*1000);
        String to = String.valueOf((i+1)*1000);
        String sql = "select * from table a where a.id >= " + from + " and a.id <= " + to;
        DataSet<Row> source = env.createInput(JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("org.postgresql.Driver")
                .setDBUrl("jdbc:postgresql://****:5432/test")
                .setUsername("root")
                .setPassword("****")
                .setQuery(sql)
                .setRowTypeInfo(new RowTypeInfo(BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.DATE_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.BIG_DEC_TYPE_INFO))
                .finish()).rebalance();
        DataSet<Tuple2<String, Float>> top = source.flatMap(new MyFlot()).groupBy(0).sum(1).sortPartition(1, Order.DESCENDING).first(10);
        top.print();
    }
    env.execute();

So how can I run the SQL queries in a distributed way, so that I make full use of the Flink cluster? Thanks!
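One detail in the loop above: the bounds are inclusive on both sides (`>= from` and `<= to`), so rows with id 1000 and 2000 match two of the three queries. A minimal sketch of computing contiguous, non-overlapping id ranges in plain Java follows. The class name `RangeSplitter` and its `split` method are hypothetical helpers written for illustration, not part of Flink. As far as I know, `JDBCInputFormat` also offers `setParametersProvider` for feeding bounds like these into a single `?`-parameterized query, with each parameter row becoming its own input split, but that should be checked against the flink-jdbc version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Flink): splits an inclusive id range
// into contiguous, non-overlapping inclusive sub-ranges.
public class RangeSplitter {

    // Returns at most `parts` ranges {start, end} that together cover [min, max]
    // exactly once. Sizes differ by at most one row.
    public static long[][] split(long min, long max, int parts) {
        if (parts <= 0 || max < min) {
            throw new IllegalArgumentException("need parts > 0 and max >= min");
        }
        long total = max - min + 1;
        long base = total / parts;       // minimum size of each range
        long remainder = total % parts;  // the first `remainder` ranges get one extra id
        List<long[]> ranges = new ArrayList<>();
        long start = min;
        for (int i = 0; i < parts && start <= max; i++) {
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                continue; // fewer ids than parts: skip empty ranges
            }
            long end = start + size - 1;
            ranges.add(new long[] {start, end});
            start = end + 1;
        }
        return ranges.toArray(new long[0][]);
    }

    public static void main(String[] args) {
        // Same table and column names as in the question; ids 0..2999 are an assumption.
        for (long[] r : split(0, 2999, 3)) {
            System.out.println(
                "select * from table a where a.id between " + r[0] + " and " + r[1]);
        }
    }
}
```

With ids 0 to 2999 and 3 parts, this yields the ranges 0 to 999, 1000 to 1999, and 2000 to 2999, so no row is read twice.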

Apache Flink: How to run multiple InputFormats on multiple nodes

0 answers: