Terminating database requests inside a Flink operation

Asked: 2018-05-17 15:43:23

Tags: cassandra kotlin apache-flink flink-sql

I am trying to use Flink together with Cassandra. Both are massively parallel environments, but I am having trouble getting them to work together.

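Right now I need to read from Cassandra in parallel over different token ranges, and to be able to terminate the query once N objects have been read.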
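To make "different token ranges" concrete, here is a minimal sketch of how the ring could be cut into contiguous ranges. It assumes the table uses Murmur3Partitioner (tokens are signed 64-bit values) and that the query filters with something like `token(pk) > ? AND token(pk) <= ?`; the real query comes from `context.prepareQuery()` and is not shown. `TokenRange` and `splitTokenRing` are hypothetical names standing in for `CassandraTokenRange` and whatever produces it.

```kotlin
import java.math.BigInteger

// Hypothetical stand-in for the question's CassandraTokenRange (its definition is not shown).
data class TokenRange(val start: Long, val end: Long)

// Splits the full 64-bit token ring [Long.MIN_VALUE, Long.MAX_VALUE] into `parts`
// contiguous ranges. Each range's end is the next range's start.
fun splitTokenRing(parts: Int): List<TokenRange> {
    require(parts > 0) { "parts must be positive" }
    val min = BigInteger.valueOf(Long.MIN_VALUE)
    val max = BigInteger.valueOf(Long.MAX_VALUE)
    val width = max.subtract(min) // 2^64 - 1 does not fit in a Long, hence BigInteger
    val n = BigInteger.valueOf(parts.toLong())
    return (0 until parts).map { i ->
        val lo = min.add(width.multiply(BigInteger.valueOf(i.toLong())).divide(n))
        val hi = min.add(width.multiply(BigInteger.valueOf(i + 1L)).divide(n))
        TokenRange(lo.toLong(), hi.toLong())
    }
}
```

For example, `splitTokenRing(128)` would yield one range per subtask at a parallelism of 128; using more ranges than subtasks keeps each individual query smaller.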

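Batch mode would suit me better, but DataStreams are also an option. I tried a LongCounter (see below), but it does not work the way I expected: I could not get the global sum across subtasks, only the local values.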
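The local-only behaviour is consistent with how Flink accumulators work: each parallel subtask holds its own accumulator, and the merged total is only available from the job result, not to the running subtasks. The sketch below is plain Kotlin with no Flink dependency. It first shows how many rows get through when every subtask enforces the limit on its own local count, and then one possible workaround, which is an assumption rather than an established fix: give each subtask a fixed share of the global limit. `emittedWithLocalLimit` and `localQuota` are hypothetical helper names.

```kotlin
// Rows emitted in total when each of `parallelism` subtasks stops on its OWN count,
// which is what a check against counter.localValue amounts to.
fun emittedWithLocalLimit(parallelism: Int, rowsPerSubtask: Long, maxRows: Long): Long =
    parallelism * minOf(rowsPerSubtask, maxRows)

// Splits a global limit across subtasks so that the quotas sum exactly to maxRows.
// The first (maxRows % parallelism) subtasks receive one extra row.
fun localQuota(maxRows: Long, parallelism: Int, subtaskIndex: Int): Long {
    require(parallelism > 0) { "parallelism must be positive" }
    require(subtaskIndex in 0 until parallelism) { "subtaskIndex out of range" }
    val base = maxRows / parallelism
    val remainder = maxRows % parallelism
    return base + if (subtaskIndex < remainder) 1 else 0
}
```

Inside a rich function the two inputs would come from `runtimeContext.numberOfParallelSubtasks` and `runtimeContext.indexOfThisSubtask`. The trade-off is that a subtask whose token ranges hold fewer rows than its quota cannot hand the unused share to another subtask, so the job may return fewer than N rows overall.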

Async mode is not required, because this CassandraRequester operation already runs in a parallel context, with a parallelism of about 64 or 128.

Here is my attempt:

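import com.datastax.driver.core.ConsistencyLevel
import com.datastax.driver.core.PreparedStatement
import com.datastax.driver.mapping.Mapper
import com.datastax.driver.mapping.MappingManager
import org.apache.flink.api.common.accumulators.LongCounter
import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.configuration.Configuration
import org.apache.flink.util.Collector
import org.slf4j.LoggerFactory

class CassandraRequester<T> (val klass: Class<T>, private val context: FlinkCassandraContext):
        RichFlatMapFunction<CassandraTokenRange, T>() {

    // Shared by every instance of this function that runs in the same JVM.
    companion object {
        private val session = ApplicationContext.session!!
        private var preparedStatement: PreparedStatement? = null
        private val manager = MappingManager(session)
        private var mapper: Mapper<*>? = null
        private val log = LoggerFactory.getLogger(CassandraRequester::class.java)

        const val COUNTER_ROWS_NUMBER = "flink-cassandra-select-count"
    }

    // Accumulator; a subtask can only read its own local value from it.
    private lateinit var counter: LongCounter

    override fun open(parameters: Configuration?) {
        super.open(parameters)

        // Prepare the statement and the mapper once, on first use.
        if (preparedStatement == null)
            preparedStatement = session.prepare(context.prepareQuery()).setConsistencyLevel(ConsistencyLevel.LOCAL_ONE)
        if (mapper == null) {
            mapper = manager.mapper<T>(klass)
        }
        counter = runtimeContext.getLongCounter(COUNTER_ROWS_NUMBER)
    }

    override fun flatMap(tokenRange: CassandraTokenRange, collector: Collector<T>) {
        val bs = preparedStatement!!.bind(tokenRange.start, tokenRange.end)

        val rs = session.execute(bs)
        val resultSelect = mapper!!.map(rs)
        val iter = resultSelect.iterator()
        while (iter.hasNext()) when {
            // maxRowsExtracted == 0 means "no limit".
            this.context.maxRowsExtracted == 0L || counter.localValue < context.maxRowsExtracted -> {
                counter.add(1)
                collector.collect(iter.next() as T)
            }
            else -> {
                // Local limit reached: stop reading this range. The collector is
                // owned by Flink, so it is left open rather than closed here.
                return
            }
        }
    }

}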
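Returning early from the loop only saves work if rows are fetched lazily. The DataStax driver's ResultSet pages results as they are iterated, so abandoning the iterator means the remaining pages are never requested. The sketch below illustrates that idea in plain Kotlin with a simulated source; `PagedSource` and `readAtMost` are hypothetical names and do not call the driver.

```kotlin
// Hypothetical paged source standing in for a driver ResultSet: rows are produced
// one page at a time, and pagesFetched counts the simulated round trips.
class PagedSource(private val totalRows: Int, private val pageSize: Int) {
    var pagesFetched = 0
        private set

    fun rows(): Sequence<Int> = sequence {
        var next = 0
        while (next < totalRows) {
            pagesFetched++ // one simulated round trip per page
            val end = minOf(next + pageSize, totalRows)
            for (row in next until end) yield(row)
            next = end
        }
    }
}

// Reads at most `limit` rows; the lazy sequence stops pulling pages once the limit is hit.
fun readAtMost(source: PagedSource, limit: Int): List<Int> =
    source.rows().take(limit).toList()
```

With 1000 rows in pages of 100, reading at most 250 rows touches only the first three pages. In the real function, the page size would be governed by the statement's fetch size, so a smaller fetch size reduces how many rows are read past the limit at the cost of more round trips.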

Is it possible to terminate the query in this situation?

0 answers:

No answers yet