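I have to page through a remote resource (read all of its entries), and I have no access to it other than HTTP(S). I am sending the HTTP GET requests with a shared HttpClient (reason), but during paging I always get an exception at the same point. It drives me crazy, because if I jump straight to the "faulty" request without reading all the previous pages, it works. So I would state the error condition as:
If I send N requests, the next request gives me an exception: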
System.Net.Http.HttpRequestException: Error while copying content to a stream. ---> System.IO.IOException: Unable to read data from the transport connection: The connection was closed.
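I am not sure what N depends on, but when I changed the page size (number of entries per page) to a larger number, I got the error sooner. Paging with pageSize = 400 works for the first 43 requests and the 44th throws the exception, while with pageSize = 500 the 35th request throws. Note that I can request these pages without any problem if I read them on their own.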
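A rough way to read these two data points is to map each failing request to the range of entry indices it covers and intersect the two ranges. This is only a sketch: it assumes the first request asks for pageNumber = 0 (as RequestNextPage does with the zero-based index) and that the server returns entries in the same order on every run. The class and method names are made up for illustration.

```csharp
using System;

public static class FailingPageRange
{
    // Half-open range [start, end) of zero-based entry indices covered by the
    // request with the given 1-based ordinal.
    // Assumption: request 1 asks for pageNumber = 0.
    public static void EntriesCovered(int pageSize, int requestOrdinal, out int start, out int end)
    {
        int pageNumber = requestOrdinal - 1;
        start = pageNumber * pageSize;
        end = start + pageSize;
    }

    // Intersects the entry ranges of two failing requests.
    // Returns false if the ranges do not overlap.
    public static bool TryOverlap(
        int pageSizeA, int failingOrdinalA,
        int pageSizeB, int failingOrdinalB,
        out int start, out int end)
    {
        EntriesCovered(pageSizeA, failingOrdinalA, out int startA, out int endA);
        EntriesCovered(pageSizeB, failingOrdinalB, out int startB, out int endB);
        start = Math.Max(startA, startB);
        end = Math.Min(endA, endB);
        return start < end;
    }
}
```

Calling FailingPageRange.TryOverlap(400, 44, 500, 35, out int start, out int end) gives start = 17200 and end = 17500: the 44th request with pageSize = 400 covers entries [17200, 17600) and the 35th with pageSize = 500 covers [17000, 17500), so both failures are consistent with a problem somewhere in entries 17200 to 17499.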
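The target framework is .NET 4.8. The app will eventually be an Owin self-hosted WebClient app, but for now I am running the paging through tests (Microsoft.VisualStudio.TestTools.UnitTesting).
I am sending the HTTP requests inside a Polly retry policy, so it should recover from any random server error, but after request N every retry fails! (Also, I only added the retry policy because of this problem. Without the retry policy I still get the error.)
The paging class: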
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
...
using HtmlAgilityPack;
using Polly;
using Serilog;

namespace ......Paging
{
    public class ...Paging : I...Paging
    {
        public Index currentPageIndex = Index.First();
        private Task pageRequesTask;
        private string currentRequestUrl;
        private HtmlDocument currentDocument;
        private static readonly string BaseUrl = "..";
        private HttpClient client;

        public ....Paging(...Configuration configuration)
        {
            _configuration = configuration;
            SetHttpClient();
        }

        private HttpClient SetHttpClient()
        {
            var asHttpClientHandler = _configuration.AsHttpClientHandler();
            asHttpClientHandler.UseCookies = false;
            /*
            The ClientHandler:
            var proxy = new WebProxy()
            {
                Address = HttpProxyUri,
                BypassProxyOnLocal = false
            };
            return new HttpClientHandler()
            {
                Proxy = proxy,
            };
            Note: The proxy is available and working and I cannot leave it out
            */
            client = new HttpClient(asHttpClientHandler, true);
            client.DefaultRequestHeaders.Add("User-Agent", "...");
            return client;
        }

        public int PagingSize()
        {
            return 500;
        }

        public async Task<bool> IsEnd()
        {
            if (pageRequesTask == null)
            {
                pageRequesTask = Next();
            }
            // Wait for the page request
            await Task.WhenAll(pageRequesTask);
            // Deciding when the paging needs to end
        }

        public async Task<List<XyResource>> GetResources()
        {
            if (pageRequesTask == null)
            {
                pageRequesTask = Next();
            }
            // Wait for the page request
            await Task.WhenAll(pageRequesTask);
            var currentEntries = currentDocument....;
            var result = new List<XyResource>();
            // Reading, parsing, etc.
            return result;
        }

        public Task Next()
        {
            pageRequesTask = RequestNextPage();
            return pageRequesTask;
        }

        public string CurrentRequestUrl()
        {
            return currentRequestUrl;
        }

        public int CurrentPageIndexAsZeroBased()
        {
            return currentPageIndex.AsZeroBased();
        }

        private async Task RequestNextPage()
        {
            currentRequestUrl = $@"...pageSize={PagingSize()}&pageNumber={currentPageIndex.AsZeroBased()}";
            var httpRetryPolicy = Policy.Handle<System.Exception>().WaitAndRetryAsync(
                retryCount: 5,
                sleepDurationProvider: attempt => TimeSpan.FromMilliseconds(attempt * 1000), // Wait for some time before retrying
                onRetry: (exception, waitDuration) =>
                {
                    Log.ForContext<...>().Error(exception, $"Exception during HTTP request inside retry policy.");
                }
            );
            await httpRetryPolicy.ExecuteAsync(async () =>
            {
                // The whole try-catch is here because I wanted to test what happens
                // if I re-create the HttpClient when the exception is thrown.
                // It did not help: even after creating a new HttpClient and
                // letting Polly re-send the request, I still get the error.
                try
                {
                    using (var requestMessage = new HttpRequestMessage())
                    {
                        requestMessage.Method = HttpMethod.Get;
                        requestMessage.RequestUri = new Uri(currentRequestUrl);
                        requestMessage.Headers.Add("Cookie", "setting cookies");
                        using (HttpResponseMessage response = await client.SendAsync(requestMessage))
                        using (HttpContent content = response.Content)
                        {
                            response.EnsureSuccessStatusCode();
                            var pageContent = await content.ReadAsStringAsync();
                            var doc = new HtmlDocument();
                            doc.LoadHtml(pageContent);
                            currentDocument = doc;
                        }
                    }
                }
                catch (System.Exception e)
                {
                    SetHttpClient();
                    throw;
                }
            });
            currentPageIndex++;
        }

        public void Dispose()
        {
            client.Dispose();
        }
    }
}
I use it in my test like this:
using (var paging = new ..Paging(..))
{
    bool isEnd = false;
    while (!isEnd)
    {
        var resources = await paging.GetResources();
        // foreach resources ...etc
        var pagingEnd = await paging.IsEnd();
        if (resources.Count < 1 || pagingEnd)
        {
            isEnd = true;
        }
        else
        {
            await paging.Next();
        }
    }
}
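The exception:
System.Net.Http.HttpRequestException: Error while copying content to a stream. ---> System.IO.IOException: Unable to read data from the transport connection: The connection was closed.
   at System.Net.ConnectStream.EndRead(IAsyncResult asyncResult)
   at System.Net.Http.HttpClientHandler.WebExceptionWrapperStream.EndRead(IAsyncResult asyncResult)
   at System.Net.Http.StreamToStreamCopy.BufferReadCallback(IAsyncResult ar)
Edit:
I tried changing the code so that, instead of reusing the HttpClient, it creates a new one for each request (using (var client = new HttpClient())), but I still get the error at the same point as before.
Edit2:
I changed the code to use HttpWebRequest instead of HttpClient, and the same error occurs at the same point.
Edit3:
I tried again without the proxy. Same error.
I tried setting System.Net.ServicePointManager.SecurityProtocol = System.Net.SecurityProtocolType.Tls12; at the start of the application; same error. (Although I checked https://www.ssllabs.com/ssltest/, and the server supports 1.0 and 1.1 as well.)
Edit4:
Now, when using HttpWebRequest, I set KeepAlive and ProtocolVersion, but no luck, same result.
request.KeepAlive = false;
request.ProtocolVersion = HttpVersion.Version10;
request.ServicePoint.ConnectionLimit = 1;
I am starting to lose hope.
Edit5:
It looks like the cause of the error is the remote server after all... After trying several page sizes, I found that they all throw the error on the page that contains one particular entry, which triggers the above error no matter what.
At least I learned a lot along the way.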
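If the failure really is tied to one entry, one way to pin it down might be to request a suspect range one entry at a time and record which requests fail. The sketch below rests on assumptions that are not confirmed above: that the server accepts pageSize=1 (so pageNumber equals the zero-based entry index) and that entries come back in a stable order. EntryProbe is a made-up name, and fetchPage is a hypothetical delegate wrapping whatever request code is in use (HttpClient or HttpWebRequest) that throws when the request fails.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class EntryProbe
{
    // Requests every entry in [startInclusive, endExclusive) on its own and
    // returns the zero-based indices of the entries whose request threw.
    // fetchPage(pageSize, pageNumber) is a hypothetical delegate, not a real API.
    public static async Task<List<int>> FindFailingEntriesAsync(
        int startInclusive, int endExclusive, Func<int, int, Task> fetchPage)
    {
        var failing = new List<int>();
        for (int entry = startInclusive; entry < endExclusive; entry++)
        {
            try
            {
                // Assumption: with pageSize = 1, pageNumber is the entry index.
                await fetchPage(1, entry).ConfigureAwait(false);
            }
            catch (Exception)
            {
                // Any failure counts here; a real probe might only record
                // HttpRequestException / IOException and rethrow the rest.
                failing.Add(entry);
            }
        }
        return failing;
    }
}
```

The requests are sent sequentially on purpose, so the probe does not add load or connection-reuse effects of its own. If only a single index comes back, that entry can be reported to whoever runs the remote server.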