Best approach for the Amazon Kinesis GetRecords API

Asked: 2014-10-10 11:01:39

Tags: c# amazon-web-services amazon-kinesis

I have set up a Kinesis stream in Amazon Web Services. I want to accomplish the following tasks:

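  1. Put records into a single stream with a single shard (C# API) - Success
  2. I also wrote a sample app in which multiple producers write to different streams - Success
  3. I also set up a sample app in which multiple workers put data into a single stream - Success
  4. In addition, I want to be able to enforce sequence-number ordering on the records (SequenceNumberForOrdering).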

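    But the real pain is the GetRecords consumer operation in the Kinesis C# API.

    I created a sample application for consuming records. The problem is that it does not stop iterating even when there are no records in the Kinesis stream. Also, saving the SequenceNumber in a database or a file and reading it back again is time-consuming, so what is the advantage of using a Kinesis stream with GetRecords?

    Why does it keep iterating even when there is no data in the stream?

    I am using the following code for reference: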

            private static void GetFilesKinesisStream()
            {
                IAmazonKinesis kinesis = AWSClientFactory.CreateAmazonKinesisClient();
                try
                {
                    ListStreamsResponse listStreams = kinesis.ListStreams();
                    int numBuckets = 0;
                    if (listStreams.StreamNames != null &&
                        listStreams.StreamNames.Count > 0)
                    {
                        numBuckets = listStreams.StreamNames.Count;
                        Console.WriteLine("You have " + numBuckets + " Amazon Kinesis Streams.");
                        Console.WriteLine(string.Join(",\n", listStreams.StreamNames.ToArray()));
    
                        DescribeStreamRequest describeRequest = new DescribeStreamRequest();
                        describeRequest.StreamName = "******************";
    
                        DescribeStreamResponse describeResponse = kinesis.DescribeStream(describeRequest);
                        List<Shard> shards = describeResponse.StreamDescription.Shards;
                        foreach (Shard s in shards)
                        {
                            Console.WriteLine("shard: " + s.ShardId);
                        }
    
                        string primaryShardId = shards[0].ShardId;
    
                        GetShardIteratorRequest iteratorRequest = new GetShardIteratorRequest();
                        iteratorRequest.StreamName = "*********************";
                        iteratorRequest.ShardId = primaryShardId;
                        iteratorRequest.ShardIteratorType = ShardIteratorType.AT_SEQUENCE_NUMBER;
                        iteratorRequest.StartingSequenceNumber = "49544005271533118105145368110776211536226129690186743810";
    
                        GetShardIteratorResponse iteratorResponse = kinesis.GetShardIterator(iteratorRequest);
                        string iterator = iteratorResponse.ShardIterator;
    
                        Console.WriteLine("Iterator: " + iterator);
                        //Step #3 - get records in this iterator
                        GetShardRecords(kinesis, iterator);
    
                        Console.WriteLine("All records read.");
                        Console.ReadLine();
                    }
                    // sr.WriteLine("You have " + numBuckets + " Amazon S3 bucket(s).");
                }
                catch (AmazonKinesisException ex)
                {
                    if (ex.ErrorCode != null && ex.ErrorCode.Equals("AuthFailure"))
                    {
                        Console.WriteLine("The account you are using is not signed up for Amazon Kinesis.");
                        Console.WriteLine("You can sign up for Amazon Kinesis at http://aws.amazon.com/kinesis");
                    }
                    else
                    {
                        Console.WriteLine("Caught Exception: " + ex.Message);
                        Console.WriteLine("Response Status Code: " + ex.StatusCode);
                        Console.WriteLine("Error Code: " + ex.ErrorCode);
                        Console.WriteLine("Error Type: " + ex.ErrorType);
                        Console.WriteLine("Request ID: " + ex.RequestId);
                    }
                }
            }
    
            private static void GetShardRecords(IAmazonKinesis client, string iteratorId)
            {
                //create request
                GetRecordsRequest getRequest = new GetRecordsRequest();
                getRequest.Limit = 100;
                getRequest.ShardIterator = iteratorId;
    
    
                //call "get" operation and get everything in this shard range
                GetRecordsResponse getResponse = client.GetRecords(getRequest);
                //get reference to next iterator for this shard
                string nextIterator = getResponse.NextShardIterator;
                //retrieve records
                List<Record> records = getResponse.Records;
    
                //print out each record's data value
                foreach (Record r in records)
                {
                    //pull out (JSON) data in this record
                    string s = Encoding.UTF8.GetString(r.Data.ToArray());
                    Console.WriteLine("Record: " + s);
                    Console.WriteLine("Partition Key: " + r.PartitionKey);
                }
    
                if (null != nextIterator)
                {
                    //if there's another iterator, call operation again
                    GetShardRecords(client, nextIterator);
                }
            }
    

1 answer:

Answer 0 (score: 1)

Why would a Kinesis consumer keep iterating after the "end" of the data?

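Because there is no "end". Kinesis is a bit like a queue, but not exactly. Think of it as a moving time window of recorded events. You don't consume records; you passively examine the records that are currently in the window (the retention period, which is 24 hours by default). Because the window is always moving, once you reach the "last" record, the consumer simply keeps watching in real time. New records can arrive at any time; the consumer has no way of knowing that there are no producers.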
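This is also why the recursive GetShardRecords in the question never returns: on an open shard the next iterator stays non-null even when a batch comes back empty. A minimal sketch of an iterative polling loop follows. It assumes a hypothetical `Batch` type and a `getRecords` delegate standing in for the real GetRecords call, so it runs without the AWS SDK. The "stop after N empty polls" rule is a consumer-side heuristic chosen for illustration, not something the stream tells you.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for one GetRecords response: the records returned
// and the iterator to use for the next call.
public sealed class Batch
{
    public List<string> Records = new List<string>();
    public string NextIterator;
}

public static class ShardPoller
{
    // Polls iteratively (no recursion, so the stack does not grow) and gives up
    // after maxEmptyPolls consecutive empty batches. Returns the number of
    // records handed to the handler.
    public static int Poll(Func<string, Batch> getRecords, string iterator,
                           int maxEmptyPolls, TimeSpan idleDelay,
                           Action<string> handle)
    {
        int processed = 0;
        int emptyPolls = 0;
        while (iterator != null && emptyPolls < maxEmptyPolls)
        {
            Batch batch = getRecords(iterator);
            if (batch.Records.Count == 0)
            {
                // Nothing new right now: count the miss and back off briefly.
                emptyPolls++;
                Thread.Sleep(idleDelay);
            }
            else
            {
                // Fresh data resets the empty-poll counter.
                emptyPolls = 0;
                foreach (string record in batch.Records)
                {
                    handle(record);
                    processed++;
                }
            }
            // Always continue from the iterator returned by the last call.
            iterator = batch.NextIterator;
        }
        return processed;
    }
}
```

With the real SDK, the delegate would wrap `client.GetRecords(...)` and map `Records` and `NextShardIterator` into the `Batch`. A long-running consumer would normally drop the empty-poll limit and keep polling with a delay.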

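If you want to stop based on some condition, that condition has to be included in your payload. For example, if you want to stop once you have caught up to "now", part of your payload could be a timestamp that the consumer compares with the current time.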
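As a rough illustration of the timestamp idea, the sketch below assumes a made-up payload format in which each record starts with an ISO 8601 UTC timestamp, then `|`, then the message body. The format, the `CaughtUpCheck` class and the `tolerance` parameter are all assumptions for the example. Your producers may encode the timestamp differently, for instance as a JSON field.

```csharp
using System;
using System.Globalization;

public static class CaughtUpCheck
{
    // Returns true when the record's embedded timestamp is within `tolerance`
    // of `nowUtc`, i.e. the consumer has (approximately) caught up to "now".
    // Assumed payload format: "2014-10-10T11:01:39Z|message body".
    public static bool IsCaughtUp(string payload, DateTime nowUtc, TimeSpan tolerance)
    {
        int sep = payload.IndexOf('|');
        if (sep < 0)
        {
            return false; // no timestamp present: cannot decide, keep reading
        }

        DateTime sentUtc;
        bool parsed = DateTime.TryParse(
            payload.Substring(0, sep),
            CultureInfo.InvariantCulture,
            DateTimeStyles.AdjustToUniversal | DateTimeStyles.AssumeUniversal,
            out sentUtc);
        if (!parsed)
        {
            return false; // unreadable timestamp: keep reading
        }

        return nowUtc - sentUtc <= tolerance;
    }
}
```

The consumer would call `IsCaughtUp(text, DateTime.UtcNow, TimeSpan.FromSeconds(10))` on each decoded record and leave its polling loop when it returns true. The tolerance has to allow for clock skew between producers and the consumer.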