Event Hub - Not getting the expected throughput

Time: 2019-04-04 23:24:57

Tags: c# azure asynchronous azure-eventhub

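I have set up an Event Hub instance on the Standard tier with 20 throughput units and 32 partitions. According to the documentation, each throughput unit equals 1 MB/s. So, ideally, I should get a throughput of 20 MB/s, or 1.2 GB/min. The namespace contains only this one event hub, and I am its only user. The event hub is set up in West US, which is the closest option to where the requests are sent from.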

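However, I find that it takes at least 10 minutes to process 1.77 GB of data. I am using asynchronous batch calls and packing each request up to the 1 MB limit. The time taken by the SendBatchAsync calls varies widely, from 0.15 to 25 seconds.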
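To put the gap in numbers, here is the arithmetic behind the figures above, taking 1 GB as 1000 MB (the same convention that makes 20 MB/s equal 1.2 GB/min). The last line rests on an assumption that is not measured here: that batches of roughly 1 MB go out one at a time, as they would with maxConcurrency = 1. Under that assumption, throughput is about one batch size divided by the mean send latency.

```latex
\begin{align*}
\text{expected} &= 20 \times 1\ \tfrac{\text{MB}}{\text{s}} = 20\ \tfrac{\text{MB}}{\text{s}}
                 = 20 \times 60\ \tfrac{\text{MB}}{\text{min}} = 1.2\ \tfrac{\text{GB}}{\text{min}} \\
\text{observed} &\le \frac{1770\ \text{MB}}{600\ \text{s}} = 2.95\ \tfrac{\text{MB}}{\text{s}}
                 \approx 15\%\ \text{of expected} \\
\text{mean latency per 1 MB batch (assumed serialized sends)} &\ge \frac{1\ \text{MB}}{2.95\ \text{MB/s}} \approx 0.34\ \text{s},
  \qquad \text{needed for } 20\ \tfrac{\text{MB}}{\text{s}}:\ \frac{1\ \text{MB}}{20\ \text{MB/s}} = 0.05\ \text{s}
\end{align*}
```

If that assumption holds, even the fastest call observed (0.15 s) would cap a single serialized stream at about 6.7 MB/s.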

Here is my code (note: I am restricted to .NET Framework 4.5):

    static EventHubClient eventHubClient;        

    static Dictionary<int, List<EventData>> events = new Dictionary<int, List<EventData>>();
    static Dictionary<int, long> batchSizes = new Dictionary<int, long>();

    static long threshold = (long)(1e6 - 1000);

    static SemaphoreSlim concurrencySemaphore;
    static int maxConcurrency = 1;
    static void Main()
    {
        eventHubClient = EventHubClient.CreateFromConnectionString(connectionString, eventHubName);

        Stopwatch stopWatch = new Stopwatch();
        stopWatch.Start();            

        using (concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
        {
            foreach (string record in GetRecords())
            {
                Tuple<int, EventData> currentEventDetails = GetEventData(record);
                int partitionId = currentEventDetails.Item1;
                EventData currentEvent = currentEventDetails.Item2;

                BatchOrSendAsync(partitionId, currentEvent);

            }

            SendRemainingAsync();
        }

        stopWatch.Stop();
        Console.WriteLine(string.Format("### total time taken = {0}", stopWatch.Elapsed.TotalSeconds.ToString()));

    }

    static async void BatchOrSendAsync(int partitionId, EventData currentEvent)
    {
        long batchSize = 0;

        batchSizes.TryGetValue(partitionId, out batchSize);
        long currentEventSize = currentEvent.SerializedSizeInBytes;

        if( batchSize + currentEventSize > threshold)
        {
            List<EventData> eventsToSend = events[partitionId];
            if (eventsToSend == null || eventsToSend.Count == 0)
            {
                if (currentEventSize > threshold)
                    throw new Exception("found event with size above threshold");
                return;
            }

            concurrencySemaphore.Wait();
            Stopwatch stopWatch = new Stopwatch();
            stopWatch.Start();
            await eventHubClient.SendBatchAsync(eventsToSend);
            stopWatch.Stop();
            Console.WriteLine(stopWatch.Elapsed.TotalSeconds.ToString());
            concurrencySemaphore.Release();

            events[partitionId] = new List<EventData> { currentEvent };
            batchSizes[partitionId] = currentEventSize;
        }
        else
        {
            if (!events.ContainsKey(partitionId))
            { 
                events[partitionId] = new List<EventData>();
                batchSizes[partitionId] = 0;
            }
            events[partitionId].Add(currentEvent);
            batchSizes[partitionId] += currentEventSize;
        }
    }

    static async void SendRemainingAsync()
    {
        foreach(int partitionId in events.Keys)
        {                
            concurrencySemaphore.Wait();
            Stopwatch stopWatch = new Stopwatch();
            stopWatch.Start();                
            await eventHubClient.SendBatchAsync(events[partitionId]);
            stopWatch.Stop();
            Console.WriteLine(stopWatch.Elapsed.TotalSeconds.ToString());
            concurrencySemaphore.Release();
        }
    }

Note: increasing the semaphore's maxConcurrency only reduces the overall time taken, and the SendBatchAsync calls start erroring out when maxConcurrency is 10.
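For comparison, gating concurrency with a SemaphoreSlim without blocking the calling thread usually looks something like the sketch below. It is not a tested fix for the code above, only an illustration under assumptions: SimulatedSendAsync is a hypothetical stand-in that merely delays (no Event Hubs call is made), and the names SendOneAsync, RunAsync and gate are made up for the example. It sticks to APIs available in .NET Framework 4.5 (SemaphoreSlim.WaitAsync, Task.WhenAll, Task.Delay).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class ThrottledSendSketch
{
    // Hypothetical stand-in for a real send call; it only simulates latency.
    static Task SimulatedSendAsync(int batchId)
    {
        return Task.Delay(100);
    }

    static async Task SendOneAsync(SemaphoreSlim gate, int batchId)
    {
        // WaitAsync yields to the caller instead of blocking the thread.
        await gate.WaitAsync();
        try
        {
            await SimulatedSendAsync(batchId);
        }
        finally
        {
            // Release the slot even if the send throws.
            gate.Release();
        }
    }

    static async Task RunAsync(int maxConcurrency, int batchCount)
    {
        using (SemaphoreSlim gate = new SemaphoreSlim(maxConcurrency))
        {
            List<Task> pending = new List<Task>();
            for (int i = 0; i < batchCount; i++)
            {
                // Returning Task (not async void) lets the caller track each send.
                pending.Add(SendOneAsync(gate, i));
            }
            // The semaphore is disposed only after every send has completed.
            await Task.WhenAll(pending);
        }
    }

    static void Main()
    {
        // C# 5 has no async Main, so block once at the top level.
        RunAsync(4, 20).GetAwaiter().GetResult();
        Console.WriteLine("all batches sent");
    }
}
```

The point of the shape is that every send returns a Task the caller can await, so timing with a Stopwatch around RunAsync would cover the sends themselves and failures would surface as exceptions from Task.WhenAll.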

How can I improve the throughput?

0 answers:

No answers