Question

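I have a very simple API with a singleton instance that holds many HttpClients, which are reused across the whole application. When the API is called, I create a list of tasks, each of which calls one client. I use a CancellationToken plus a separate task to enforce a strict timeout.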

public class APIController : Controller
{
    private IHttpClientsFactory _client;
    public APIController(IHttpClientsFactory client)
    {
        _client = client;
    }

    [HttpPost]
    public async Task<IActionResult> PostAsync()
    {
        var cts = new CancellationTokenSource();
        var allTasks = new ConcurrentBag<Task<Response>>();
        foreach (var name in list)//30 clients in list here
        {
            allTasks.Add(CallAsync(_client.Client(name), cts.Token));
        }
        cts.CancelAfter(1000);
        await Task.WhenAny(Task.WhenAll(allTasks), Task.Delay(1000));
        //do something with allTasks 
    }
}

CallAsync is also very simple: it just uses the client to make the call and awaits the response.

 var response = await client.PostAsync(endpoint, content, token);

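This code works correctly: it returns after 1 second and then sends a cancellation request to any task that has not yet returned. The task list has about 30 clients, so each API call hits 30 endpoints, with an average response time of 800 ms.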
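The deadline pattern described above can be sketched as a small helper. This is a minimal sketch, not the asker's actual code: the names `FanOut` and `CollectWithinAsync` are hypothetical, and it assumes the caller only wants the results of calls that finished successfully before the deadline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class FanOut
{
    // Starts every call with one shared token and returns the results of the
    // calls that completed successfully before the deadline (hypothetical helper).
    public static async Task<List<T>> CollectWithinAsync<T>(
        IEnumerable<Func<CancellationToken, Task<T>>> calls, TimeSpan deadline)
    {
        using (var cts = new CancellationTokenSource())
        {
            List<Task<T>> tasks = calls.Select(call => call(cts.Token)).ToList();

            // Wait for all calls or for the deadline, whichever comes first.
            // Passing the token lets Cancel() below release the delay timer early.
            await Task.WhenAny(Task.WhenAll(tasks), Task.Delay(deadline, cts.Token));

            // Ask any stragglers to stop; their results are ignored.
            cts.Cancel();

            // Keep only the calls that ran to completion (not faulted or canceled).
            return tasks.Where(t => t.Status == TaskStatus.RanToCompletion)
                        .Select(t => t.Result)
                        .ToList();
        }
    }
}
```

In the controller, each element of `calls` would wrap one `CallAsync(client, token)` invocation, and the deadline would be `TimeSpan.FromMilliseconds(1000)`.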

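This application handles 3000 concurrent calls per second, which means roughly 100k HttpClient calls per second.

The problem is that there is some bottleneck in HttpClient: CPU usage is always very high, and I need about 80 (eighty) 16-core VMs with 32 GB of RAM to handle the traffic. Clearly something is wrong.

One hint I have is that the exact same code performed much better before I updated my NuGet packages to ASP.NET Core 2.

I ran a diagnostic on the server. It shows no errors in my code, but it looks like the HttpClients are waiting on each other or getting stuck.

There is nothing else in the trace. I use a factory to create a single instance per endpoint: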

public class HttpClientsFactory : IHttpClientsFactory
{
    public static Dictionary<string, HttpClient> HttpClients { get; set; }

    public HttpClientsFactory()
    {
        HttpClients = new Dictionary<string, HttpClient>();
        Initialize();
    }

    private static void Initialize()
    {
        HttpClients.Add("Name1", CreateClient("http://......"));  
        HttpClients.Add("Name2", CreateClient("http://...."));
        HttpClients.Add("Name3", CreateClient("http://...."));

    }

    public Dictionary<string, HttpClient> Clients()
    {
        return HttpClients;
    }

    public HttpClient Client(string key)
    {
        try
        {
            return Clients()[key];
        }
        catch
        {
            return null;
        }
    }

    public static HttpClient CreateClient(string endpoint)
    {
        try
        {
            var config = new HttpClientHandler()
            {
                MaxConnectionsPerServer = int.MaxValue,
                AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
            };
            var client = new HttpClient(config)
            {
                Timeout = TimeSpan.FromMilliseconds(1000),
                BaseAddress = new Uri(endpoint)
            };

            client.DefaultRequestHeaders.Accept.Clear();
            client.DefaultRequestHeaders.Connection.Clear();
            client.DefaultRequestHeaders.ExpectContinue = false;
            client.DefaultRequestHeaders.ConnectionClose = false;
            client.DefaultRequestHeaders.Connection.Add("Keep-Alive");
            client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

            return client;
        }
        catch (Exception)
        {
            return null;
        }
    }
}

Then in Startup:

 services.AddSingleton<IHttpClientsFactory, HttpClientsFactory>();

What is happening here? Is a singleton HttpClient a bad fit for this case? Should I create one HttpClient instance per thread? If so, how should I do that?

Update

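After a few days of testing, I have determined that timeouts during HttpClient calls leave some connections open, which leads to port exhaustion. Any suggestions on how to avoid this?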

Answer 1

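It sounds like you are hitting the operating system's limits on HTTP requests, because of the way HTTP is designed to work. If you are using .NET Framework (rather than .NET Core), you can use ServicePointManager to tune how the OS manages HTTP requests. Most HTTP servers and clients use keep-alive, which means they reuse the same TCP connection for multiple HTTP requests. You can use the ServicePointManager.FindServicePoint() function to get the ServicePoint instance for connections to a specific host. Every HTTP request to the same host uses the same ServicePoint instance.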

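Try tuning some of the values on the ServicePoint and see how that affects your application. One example is ConnectionLimit, which controls how many parallel TCP connections are used between the client and the server. Keep-alive lets you reuse an existing TCP connection for new HTTP requests, and pipelining lets you send multiple requests at once (use SupportsPipelining to check whether the server you connect to supports it), but HTTP requires responses to arrive in the same order as the requests. This means that an initial slow response blocks all subsequent requests and responses. With multiple TCP connections, you can have multiple non-blocking HTTP requests in parallel. There is of course a downside: you now have multiple TCP connections that have to compete with each other for network resources. So tune this value and see whether it improves things, but be careful!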
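The tuning described above might look like the following on .NET Framework. This is a minimal sketch: the class name `ServicePointTuning` and the connection limit you pass in are illustrative assumptions, and the right value has to be found by measurement.

```csharp
using System;
using System.Net;

public static class ServicePointTuning
{
    // Looks up the ServicePoint shared by all requests to this host and
    // sets how many parallel TCP connections it may open.
    public static ServicePoint Tune(Uri host, int connectionLimit)
    {
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(host);
        servicePoint.ConnectionLimit = connectionLimit;

        // Read-only flag: reports whether the server is known to support pipelining.
        Console.WriteLine("Supports pipelining: " + servicePoint.SupportsPipelining);
        return servicePoint;
    }
}
```

Call it once per endpoint at startup, before the first request to that host is sent.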

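As stated above, the real problem here is probably the HTTP protocol itself and the fact that it can really only handle one request at a time per connection. Fortunately there is a solution in HTTP/2, although unfortunately it is not yet well supported in ASP.NET. I have not tried it, because the systems I work with do not support it yet, but in theory it should let you send and receive multiple HTTP requests in parallel without blocking. Take a look at this thread, which describes a way to get it working.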

Edit

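I completely missed that you are running on ASP.NET Core 2.0. In that case, you don't have access to ServicePointManager. However, if you only need to run on Windows, you can install the WinHttpHandler NuGet package and set its MaxConnectionsPerServer property. If you do use WinHttpHandler, I suggest you also try HTTP/2 and see whether it improves things for you.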

EDIT 2

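We just found an issue where POST requests took twice as long as GET requests, even though the rest of the request was exactly the same. We noticed it when connecting to our local server from an off-site location, so the ping was much higher than usual. This suggested that a POST made two round trips while a GET made only one. The solution was these two lines: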

ServicePointManager.UseNagleAlgorithm = false;
ServicePointManager.Expect100Continue = false;

Maybe this helps you?

Answer 2

Try setting MaxConnectionsPerServer = Environment.ProcessorCount.

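With MaxConnectionsPerServer = int.MaxValue, your program creates and uses too many threads, which compete with each other for CPU resources. As a result, context switching between threads and the per-thread overhead reduce performance.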
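Applied to the factory from the question, this suggestion might look like the sketch below. The class name `BoundedClientFactory` is hypothetical, and using the processor count as the limit is this answer's suggestion, not a measured optimum.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class BoundedClientFactory
{
    // Builds a handler that caps the number of concurrent connections per server.
    public static HttpClientHandler CreateHandler(int maxConnections)
    {
        return new HttpClientHandler
        {
            MaxConnectionsPerServer = maxConnections,
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
    }

    // Same client settings as in the question, but with a bounded connection count.
    public static HttpClient CreateClient(string endpoint)
    {
        return new HttpClient(CreateHandler(Environment.ProcessorCount))
        {
            Timeout = TimeSpan.FromMilliseconds(1000),
            BaseAddress = new Uri(endpoint)
        };
    }
}
```

Since the limit applies per server, it is worth load-testing a few values rather than assuming the processor count is ideal.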

Answer 3

If you look at the code of HttpClient, you will see that when the Timeout property is not equal to Threading.Timeout.InfiniteTimeSpan, HttpClient creates a CancellationTokenSource with a timeout for every request:

        CancellationTokenSource cts;
        bool disposeCts;
        bool hasTimeout = _timeout != s_infiniteTimeout;
        if (hasTimeout || cancellationToken.CanBeCanceled)
        {
            disposeCts = true;
            cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, _pendingRequestsCts.Token);
            if (hasTimeout)
            {
                cts.CancelAfter(_timeout);
            }
        }

Looking at the code of CancelAfter(), we can see that it internally creates a System.Threading.Timer object:

public void CancelAfter(Int32 millisecondsDelay)
    {
        ThrowIfDisposed();

        if (millisecondsDelay < -1)
        {
            throw new ArgumentOutOfRangeException("millisecondsDelay");
        }

        if (IsCancellationRequested) return;

        // There is a race condition here as a Cancel could occur between the check of
        // IsCancellationRequested and the creation of the timer.  This is benign; in the 
        // worst case, a timer will be created that has no effect when it expires.

        // Also, if Dispose() is called right here (after ThrowIfDisposed(), before timer
        // creation), it would result in a leaked Timer object (at least until the timer
        // expired and Disposed itself).  But this would be considered bad behavior, as
        // Dispose() is not thread-safe and should not be called concurrently with CancelAfter().

        if (m_timer == null)
        {
            // Lazily initialize the timer in a thread-safe fashion.
            // Initially set to "never go off" because we don't want to take a
            // chance on a timer "losing" the initialization ---- and then
            // cancelling the token before it (the timer) can be disposed.
            Timer newTimer = new Timer(s_timerCallback, this, -1, -1);
            if (Interlocked.CompareExchange(ref m_timer, newTimer, null) != null)
            {
                // We lost the ---- to initialize the timer.  Dispose the new timer.
                newTimer.Dispose();
            }
        }


        // It is possible that m_timer has already been disposed, so we must do
        // the following in a try/catch block.
        try
        {
            m_timer.Change(millisecondsDelay, -1);
        }
        catch (ObjectDisposedException)
        {
            // Just eat the exception.  There is no other way to tell that
            // the timer has been disposed, and even if there were, there
            // would not be a good way to deal with the observe/dispose
            // race condition.
        }

    }

When the Timer object is created, it takes a lock on the singleton TimerQueue.Instance:

    internal bool Change(uint dueTime, uint period)
    {
        bool success;

        lock (TimerQueue.Instance)
        {
            if (m_canceled)
                throw new ObjectDisposedException(null, Environment.GetResourceString("ObjectDisposed_Generic"));

            // prevent ThreadAbort while updating state
            try { }
            finally
            {
                m_period = period;

                if (dueTime == Timeout.UnsignedInfinite)
                {
                    TimerQueue.Instance.DeleteTimer(this);
                    success = true;
                }
                else
                {
                    if (FrameworkEventSource.IsInitialized && FrameworkEventSource.Log.IsEnabled(EventLevel.Informational, FrameworkEventSource.Keywords.ThreadTransfer))
                        FrameworkEventSource.Log.ThreadTransferSendObj(this, 1, string.Empty, true);

                    success = TimerQueue.Instance.UpdateTimer(this, dueTime, period);
                }
            }
        }

        return success;
    }

If you have a large number of concurrent HTTP requests, you may run into a lock convoy problem. This problem is described here and here.

To confirm this, try debugging the code with the Parallel Stacks window in Visual Studio, or use the syncblk WinDbg command from the SOS extension.
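If the timer lock does turn out to be the bottleneck, one possible mitigation (an assumption that follows from the HttpClient code quoted above, not something verified against the asker's workload) is to disable the per-client timeout and rely on the single CancellationTokenSource the controller already creates per incoming request. That way HttpClient never calls CancelAfter for the individual calls. The class name `NoPerRequestTimer` is hypothetical.

```csharp
using System;
using System.Net.Http;

public static class NoPerRequestTimer
{
    // With an infinite Timeout, HttpClient skips its own CancelAfter call, so no
    // Timer is created per outgoing request. Cancellation must then come from
    // the token the caller passes to PostAsync.
    public static HttpClient CreateClient(string endpoint)
    {
        return new HttpClient
        {
            Timeout = System.Threading.Timeout.InfiniteTimeSpan,
            BaseAddress = new Uri(endpoint)
        };
    }
}
```

With this setup, the caller's `cts.CancelAfter(1000)` is the only thing bounding the 30 outgoing calls, so every call must receive that token; otherwise a hung request would never time out.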

C# HttpClient on ASP.NET Core 2.0 on Kestrel: waiting, lock contention, high CPU

3 answers: