我们如何每隔一小时处理一次kinesis流数据?

时间:2018-04-06 10:04:35

标签: amazon-web-services amazon-kinesis amazon-kcl

我有一个用putRecord连续写的kinesis流。在我使用KCL的processRecords消费的消费者中。只要流中有记录,就会调用此进程记录。我需要每隔X小时执行一次此进程记录。

以下是我尝试过的事情。

kinesisClientLibConfiguration = new KinesisClientLibConfiguration(applicationName, kinesisStreamName,
            credsProvider, workerId).withInitialPositionInStream(initialPositionInStream)
                    .withCallProcessRecordsEvenForEmptyRecordList(true)
                    .withIdleTimeBetweenReadsInMillis(3600000) //1 hr in millis
                    .withKinesisEndpoint(kinesisEndpoint);

这确实似乎有效。它抛出以下异常

    [INFO] 2018-02-08T18:41:24.124 com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask getRecordsResult  ShardId shardId-000000000003: getRecords threw ExpiredIteratorException - restarting after greatest seqNum passed to customer
com.amazonaws.services.kinesis.model.ExpiredIteratorException: Iterator expired. The iterator was created at time Thu Feb 08 13:06:04 UTC 2018 while right now it is Thu Feb 08 13:11:23 UTC 2018 which is further in the future than the tolerated delay of 300000 milliseconds. (Service: AmazonKinesis; Status Code: 400; Error Code: ExpiredIteratorException; Request ID: d72ba81b-9a14-cf1d-85dd-e6a01179f91c)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
    at com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:2276)
    at com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2252)
    at com.amazonaws.services.kinesis.AmazonKinesisClient.executeGetRecords(AmazonKinesisClient.java:1062)
    at com.amazonaws.services.kinesis.AmazonKinesisClient.getRecords(AmazonKinesisClient.java:1038)
    at com.amazonaws.services.kinesis.clientlibrary.proxies.KinesisProxy.get(KinesisProxy.java:158)
    at com.amazonaws.services.kinesis.clientlibrary.proxies.MetricsCollectingKinesisProxyDecorator.get(MetricsCollectingKinesisProxyDecorator.java:74)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisDataFetcher.getRecords(KinesisDataFetcher.java:74)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.SynchronousGetRecordsRetrievalStrategy.getRecords(SynchronousGetRecordsRetrievalStrategy.java:31)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.BlockingGetRecordsCache.getNextResult(BlockingGetRecordsCache.java:50)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.getRecordsResultAndRecordMillisBehindLatest(ProcessTask.java:377)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.getRecordsResult(ProcessTask.java:342)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.call(ProcessTask.java:159)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:49)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:24)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

有人可以告诉我是否有解决方法可以解决这个问题? 感谢

0 个答案:

没有答案