so I have this code: This is the main function,a parallel for loop that iterates through all the data that needs to be posted and calls a function
ParallelOptions pOpt = new ParallelOptions();
pOpt.MaxDegreeOfParallelism = 30;
Parallel.For(0, maxsize, pOpt, (index,loopstate) => {
//Calls the function where all the webrequests are made
CallRequests(data1,data2);
if (isAborted)
loopstate.Stop();
});
This function is called inside the parallel loop
public static void CallRequests(string data1, string data2)
{
var cookie = new CookieContainer();
var postData = Parameters[23] + data1 +
Parameters[24] + data2;
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(Parameters[25]);
getRequest.Accept = Parameters[26];
getRequest.KeepAlive = true;
getRequest.Referer = Parameters[27];
getRequest.CookieContainer = cookie;
getRequest.UserAgent = Parameters[28];
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version10;
getRequest.AllowAutoRedirect = false;
getRequest.ContentType = Parameters[29];
getRequest.ReadWriteTimeout = 5000;
getRequest.Timeout = 5000;
getRequest.Proxy = null;
byte[] byteArray = Encoding.ASCII.GetBytes(postData);
getRequest.ContentLength = byteArray.Length;
Stream newStream = getRequest.GetRequestStream(); //open connection
newStream.Write(byteArray, 0, byteArray.Length); // Send the data.
newStream.Close();
HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
if (getResponse.Headers["Location"] == Parameters[30])
{
//These are simple get requests to retrieve the source code using the same format as above.
//I need to preserve the cookie
GetRequets(data1, data2, Parameters[31], Parameters[13], cookie);
GetRequets(data1, data2, Parameters[32], Parameters[15], cookie);
}
}
From what I have seen and been told,I understand that making these requests async is a better idea than using a parallel loop.My method is also heavy on the proccesor.I wonder how can I make these requests async,but also preserve the multithreaded aspect. I also need to keep the cookie,after the post requests finishes.
答案 0 :(得分:2)
将CallRequests
方法转换为async
实际上只是使用await
关键字切换异步方法的同步方法调用并更改方法签名以返回{{1 }}
这样的事情:
Task
然而,这本身并不能真正让你到任何地方,因为你仍然需要等待主方法中返回的任务。一个非常简单(如果有点生硬)的方式是简单地调用public static async Task CallRequestsAsync(string data1, string data2)
{
var cookie = new CookieContainer();
var postData = Parameters[23] + data1 +
Parameters[24] + data2;
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(Parameters[25]);
getRequest.Accept = Parameters[26];
getRequest.KeepAlive = true;
getRequest.Referer = Parameters[27];
getRequest.CookieContainer = cookie;
getRequest.UserAgent = Parameters[28];
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version10;
getRequest.AllowAutoRedirect = false;
getRequest.ContentType = Parameters[29];
getRequest.ReadWriteTimeout = 5000;
getRequest.Timeout = 5000;
getRequest.Proxy = null;
byte[] byteArray = Encoding.ASCII.GetBytes(postData);
getRequest.ContentLength = byteArray.Length;
Stream newStream =await getRequest.GetRequestStreamAsync(); //open connection
await newStream.WriteAsync(byteArray, 0, byteArray.Length); // Send the data.
newStream.Close();
HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
if (getResponse.Headers["Location"] == Parameters[30])
{
//These are simple get requests to retrieve the source code using the same format as above.
//I need to preserve the cookie
GetRequets(data1, data2, Parameters[31], Parameters[13], cookie);
GetRequets(data1, data2, Parameters[32], Parameters[15], cookie);
}
}
(或Task.WaitAll()
如果调用方法本身要变为异步)。像这样:
await Task.WhenAll()
然而,这实在是非常直率,并且失去了对并行运行的迭代次数的控制等等。我更喜欢使用TPL dataflow library来做这种事情。该库提供了一种并行链接异步(或同步)操作的方法,并将它们从一个"处理块"到下一个。它有很多选项可用于调整并行度,缓冲区大小等。
详细的曝光超出了这个答案的可能范围,所以我鼓励你阅读它,但一种可能的方法是简单地将其推到一个动作块 - 类似这样:
var tasks = Enumerable.Range(0, maxsize).Select(index => CallRequestsAsync(data1, data2));
Task.WaitAll(tasks.ToArray());
我的答案范围之外的其他几点我应该顺便提及:
var actionBlock = new ActionBlock<int>(async index =>
{
await CallRequestsAsync(data1, data2);
}, new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 30,
BoundedCapacity = 100,
});
for (int i=0; i <= maxsize; i++)
{
actionBlock.Post(i); // or await actionBlock.SendAsync(i) if calling method is also async
}
actionBlock.Complete();
actionBlock.Completion.Wait(); // or await actionBlock.Completion if calling method is also async
方法正在使用其结果更新某些外部变量一样。在可能的情况下,最好避免使用此模式,并使方法返回结果以便稍后进行整理(TPL Dataflow库通过CallRequests
处理)。如果更新外部状态是不可避免的,那么请确保您已经考虑了超出我的答案范围的多线程含义(死锁,竞争条件等)。TransformBlock<>
有一些有用的属性在您为问题创建最小描述时丢失了?它是否索引到参数列表或类似的东西?如果是这样,您可以随时直接迭代这些并将index
更改为ActionBlock<int>