Replacing foreach with Threading / Parallelism

时间:2017-06-09 12:43:41

标签: c# .net multithreading parallel.foreach

I'm trying to improve the execution time for a operation.

I get a list of users, and by the user id I'm executing a method from a service which calculates some data. After i get the data by the user id, I set it to every entity, and than just save the context.

Currently with this simple code it take about 1 hour. Approximately 21 - 24 seconds per 100 users. But it depends on each and every user.

using (var ctx = new MyContext())
{
    var users = ctx.Users.Where(u => u.Status != -1).ToList();

    foreach (var user in users)
    {
        user.ProfilePercentage = await UserAlgorithm.CalculatePercentage(user.Id);
    }
    await ctx.SaveChangesAsync();
}

I also tried like this, but takes even longer:

Parallel.ForEach(
    users,
    new ParallelOptions { MaxDegreeOfParallelism = -1 },
    user => {
        user.ProfilePercentage = await UserAlgorithm.CalculatePercentage(user.Id); }
);

**CalculatePercentage **

public async Task<decimal> CalculatePercentage (int userId)
{
    decimal completeness = 0;
    bool? hasEducationsTags  = null; //--> nullable boolea will handle 3 cases on SP side
    bool? hasPosition    = null; //--> nullable boolea will handle 3 cases on SP side

    ///-- check if user has any education
    if(await SchoolService.HasEducations(userId).ConfigureAwait(false))
    {
        hasEducationsTags = await SchoolService.HasAnyFieldsOfStudy(user=>user.Id == userId);
    }

    ///-- check if user can set position
    if (await UserInfoService.CanSetLocation(userId).ConfigureAwait(false))
    {
        ///-- check if position was set
        string accountName = await UserInfoService.GetAccountName(userId).ConfigureAwait(false);
        hasPosition = await HasUserSetPosition(accountName).ConfigureAwait(false);
    }

    ///-- call store procedure
    using (SqlConnection sqlConnection = ConnectionFactory.Create())
    {
        await sqlConnection.OpenAsync().ConfigureAwait(false);

        using (SqlCommand sqlCommand = sqlConnection.CreateCommand())
        {
            sqlCommand.CommandType = CommandType.StoredProcedure;
            sqlCommand.CommandText = "ProfileCompletionAlgorithm";
            sqlCommand.Parameters.AddWithValue("@USER_ID", userId);
            sqlCommand.Parameters.AddWithValue("@HAS_POSITION", hasPosition);
            sqlCommand.Parameters.AddWithValue("@HAS_SCHOOL_TAGS", hasEducationsTags);

            ///-- not using OUTPUT parameter but by getting the return result
            SqlParameter sqlParameter = sqlCommand.Parameters.Add("@RESULT", SqlDbType.Decimal);
            sqlParameter.Direction    = ParameterDirection.ReturnValue;
            sqlCommand.ExecuteNonQuery();
            completeness              = Convert.ToDecimal(sqlParameter.Value);
        }
    }
    return completeness;
}

Some of the links that I've checked:

How can I optimize this logic in order to get the best time?

If you chose to downvote please explain

Thanks.

1 个答案:

答案 0 :(得分:2)

由于您的代码受I / O限制(不受CPU限制),因此在问题上抛出更多线程(或并行性)并不会有所帮助。你想要的是异步并发(Task.WhenAll),而不是并行并发(Parallel):

using (var ctx = new MyContext())
{
  var users = ctx.Users.Where(u => u.Status != -1);
  var tasks = users.Select(UpdateUserProfilePercentageAsync);
  await Task.WhenAll(tasks);
  await ctx.SaveChangesAsync();
}

private async Task UpdateUserProfilePercentageAsync(User user)
{
  user.ProfilePercentage = await UserAlgorithm.CalculatePercentage(user.Id);
}

您可能会发现您的数据库或其他服务并不欣赏受到如此严厉的打击,在这种情况下您可以进行一些限制。例如,一次20个:

private SemaphoreSlim _throttle = new SemaphoreSlim(20);
private async Task UpdateUserProfilePercentageAsync(User user)
{
  await _throttle.WaitAsync();
  try
  {
    user.ProfilePercentage = await UserAlgorithm.CalculatePercentage(user.Id);
  }
  finally { _throttle.Release(); }
}