我正在使用HttpWebRequest.BeginGetResponse验证代理列表。它工作得很好,我可以在几秒钟内验证数千个代理并且不会阻塞。
在我的项目中的另一个类中,我调用相同的代码并阻止它。
代理验证方法(不会阻止)
public void BeginTest(IProxyTest test, Action<ProxyStatus> callback, int timeout = 10000)
{
var req = HttpWebRequest.Create(test.URL);
req.Proxy = new WebProxy(this.ToString());
req.Timeout = timeout;
WebHelper.BeginGetResponse(req, new Action<RequestCallbackState>(callbackState =>
{
if (callbackState.Exception != null)
{
callback(ProxyStatus.Invalid);
}
else
{
var responseStream = callbackState.ResponseStream;
using (var reader = new StreamReader(responseStream))
{
var responseString = reader.ReadToEnd();
if (responseString.Contains(test.Validation))
{
callback(ProxyStatus.Valid);
}
else
{
callback(ProxyStatus.Invalid);
}
}
}
}));
}
WebHelper.BeginGetResponse
public static void BeginGetResponse(WebRequest request, Action<RequestCallbackState> responseCallback)
{
Task<WebResponse> asyncTask = Task.Factory.FromAsync<WebResponse>(request.BeginGetResponse, request.EndGetResponse, null);
ThreadPool.RegisterWaitForSingleObject((asyncTask as IAsyncResult).AsyncWaitHandle, new WaitOrTimerCallback(TimeoutCallback), request, request.Timeout, true);
asyncTask.ContinueWith(task =>
{
WebResponse response = task.Result;
Stream responseStream = response.GetResponseStream();
responseCallback(new RequestCallbackState(responseStream));
responseStream.Close();
response.Close();
}, TaskContinuationOptions.NotOnFaulted);
//Handle errors
asyncTask.ContinueWith(task =>
{
var exception = task.Exception;
responseCallback(new RequestCallbackState(exception.InnerException));
}, TaskContinuationOptions.OnlyOnFaulted);
}
具有类似方法的其他类也调用WebHelper.BeginGetResponse,但阻止(为什么?)
public void BeginTest(Action<ProxyStatus> callback, int timeout = 10000)
{
var req = HttpWebRequest.Create(URL);
req.Timeout = timeout;
WebHelper.BeginGetResponse(req, new Action<RequestCallbackState>(callbackState =>
{
if (callbackState.Exception != null)
{
callback(ProxyStatus.Invalid);
}
else
{
var responseStream = callbackState.ResponseStream;
using (var reader = new StreamReader(responseStream))
{
var responseString = reader.ReadToEnd();
if (responseString.Contains(Validation))
{
callback(ProxyStatus.Valid);
}
else
{
callback(ProxyStatus.Invalid);
}
}
}
}));
}
调用阻止
的代码private async void validateTestsButton_Click(object sender, EventArgs e)
{
await Task.Run(() =>
{
foreach (var test in tests)
{
test.BeginTest((status) => test.Status = status);
}
});
}
调用不阻止的代码:
public static async Task BeginTests(ICollection<Proxy> proxies, ICollection<ProxyJudge> judges, int timeout = 10000, IProgress<int> progress = null)
{
await Task.Run(() =>
{
foreach (var proxy in proxies)
{
proxy.BeginTest(judges.GetRandomItem(), new Action<ProxyStatus>(status =>
{
proxy.Status = status;
}), timeout);
}
});
}
答案 0 :(得分:1)
虽然这个dosnt确切地解决了你的问题,但它可能会帮到你一点点
以下是一些问题
ThreadPool
类似乎有点过时了因此,您正在进行 IO绑定工作,您可能猜到的最佳模式是TBA async
await
模式。基本上每次等待一个 IO Completion 端口时,你都希望将该线程返回给操作系统并对你的系统很好,从而腾出资源以满足其需要。
此外,你显然想要一定程度的并行性,你最好至少对它有一些控制权。
我建议 TPL Dataflow 和 ActionBlock
这是一项不错的工作鉴于
public class Proxy
{
public ProxyStatus ProxyStatus { get; set; }
public string ProxyUrl { get; set; }
public string WebbUrl { get; set; }
public string Error { get; set; }
}
ActionBlock示例
public static async Task DoWorkLoads(List<Proxy> results)
{
var options = new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 50
};
var block = new ActionBlock<Proxy>(CheckUrlAsync, options);
foreach (var proxy in results)
{
block.Post(proxy);
}
block.Complete();
await block.Completion;
}
CheckUrlAsync示例
// note i havent tested this, add pepper and salt to taste
public static async Task CheckUrlAsync(Proxy proxy)
{
try
{
var request = WebRequest.Create(proxy.Url);
if (proxy.ProxyUrl != null)
request.Proxy = new WebProxy(proxy.ProxyUrl);
using (var response = await request.GetResponseAsync())
{
using (var responseStream = response.GetResponseStream())
{
using (var reader = new StreamReader(responseStream))
{
var responseString = reader.ReadToEnd();
if (responseString.Contains("asdasd"))
proxy.ProxyStatus = ProxyStatus.Valid;
else
proxy.ProxyStatus = ProxyStatus.Invalid;
}
}
}
}
catch (Exception e)
{
proxy.ProxyStatus = ProxyStatus.Error;
proxy.Error = e.Message;
}
}
<强>用法强>
await DoWorkLoads(proxies to test);
<强>摘要强>
代码更整洁,你不得不在整个地方投掷动作,你使用async
和await
抛弃了APM,你可以控制平行度并且你对线程池
答案 1 :(得分:0)
我通过将代码包装在BeginTest方法中来解决问题,该方法在Action中神秘地阻塞,然后在该Action上调用BeginInvoke。
我推断这是因为没有在该方法中设置HttpWebRequest上的Proxy属性,这似乎导致我的系统代理同步查找。
public void BeginTest(Action<ProxyStatus> callback, int timeout = 10000)
{
var action = new Action(() =>
{
var req = HttpWebRequest.Create(URL);
req.Timeout = timeout;
WebHelper.BeginGetResponse(req, new Action<RequestCallbackState>(callbackState =>
{
if (callbackState.Exception != null)
{
callback(ProxyStatus.Invalid);
}
else
{
var responseStream = callbackState.ResponseStream;
using (var reader = new StreamReader(responseStream))
{
var responseString = reader.ReadToEnd();
if (responseString.Contains(Validation))
{
callback(ProxyStatus.Valid);
}
else
{
callback(ProxyStatus.Invalid);
}
}
}
}));
});
action.BeginInvoke(null, null);
}