When I run the following code with campaigns.Count() at 200,000, it is very slow.
List<Campaign> listCampaigns = new List<Campaign>();
foreach (var item in campaigns)
{
    if (listCampaigns.Where(a => a.CampaignName == item.CampaignName && a.Term == item.Term).Count() == 0)
    {
        //this doesn't exist
        listCampaigns.Add(item);
    }
    else
    {
        //this exists already
        var campaign = listCampaigns.Where(a => a.CampaignName == item.CampaignName && a.Term == item.Term).First();
        campaign.TotalVisits += item.TotalVisits;
        List<Conversion> listConversions = item.Conversions.ToList();
        listConversions.AddRange(campaign.Conversions.ToList());
        campaign.Conversions = listConversions.ToArray();
    }
}
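Is there any part of this code I could optimize, or a different approach that would speed it up?
Any suggestions are appreciated. Thanks.
Answer 0 (score: 9)
This should be noticeably faster: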
List<Campaign> listCampaigns = new List<Campaign>();
foreach (var g in campaigns.GroupBy(c => new { c.CampaignName, c.Term }))
{
    var campaign = g.First();
    campaign.TotalVisits = g.Sum(x => x.TotalVisits);
    campaign.Conversions = g.SelectMany(c => c.Conversions).ToArray();
    listCampaigns.Add(campaign);
}
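The GroupBy loop above reuses the first Campaign of each group and overwrites its totals. If mutating the input objects is undesirable, a variant that projects each group into a new object might look like the sketch below. The Campaign and Conversion classes here are hypothetical stand-ins (the real ones are not shown in the question), and Term is assumed to be an int.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal types; the question does not show the real definitions.
public class Conversion { }

public class Campaign
{
    public string CampaignName { get; set; }
    public int Term { get; set; }               // assumed to be an int
    public int TotalVisits { get; set; }
    public Conversion[] Conversions { get; set; } = new Conversion[0];
}

public static class CampaignGrouping
{
    // Builds one new Campaign per (CampaignName, Term) pair without
    // modifying any of the input objects.
    public static List<Campaign> Merge(IEnumerable<Campaign> campaigns)
    {
        return campaigns
            .GroupBy(c => new { c.CampaignName, c.Term })
            .Select(g => new Campaign
            {
                CampaignName = g.Key.CampaignName,
                Term = g.Key.Term,
                TotalVisits = g.Sum(c => c.TotalVisits),
                Conversions = g.SelectMany(c => c.Conversions).ToArray()
            })
            .ToList();
    }
}
```

Calling CampaignGrouping.Merge(campaigns) returns the merged list; with LINQ to Objects, groups come out in the order in which each key first appears in the input.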
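Answer 1 (score: 1)
Use a Dictionary<Tuple<string,Term>,Campaign>. You can put CampaignName and Term into a tuple and use it to look up an existing Campaign in O(1). That makes the whole code O(n).
Your current code is O(n^2), because it has to traverse the entire list to check whether the current entry already exists.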
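To put a rough number on that estimate: the original loop calls Where(...).Count(), which scans every element already in listCampaigns. Assuming the worst case, in which all n = 200,000 items have distinct (CampaignName, Term) pairs, the k-th item is compared against the k - 1 items added before it:

```latex
\sum_{k=1}^{n} (k-1) \;=\; \frac{n(n-1)}{2}
\;=\; \frac{200000 \cdot 199999}{2}
\;=\; 19\,999\,900\,000 \;\approx\; 2 \times 10^{10}
```

A dictionary needs one hashed lookup per item instead, so about 2 x 10^5 lookups in total. Real inputs with many duplicates will sit below the worst case, but the gap remains several orders of magnitude.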
The code should look something like this:
var dict = new Dictionary<Tuple<string, Term>, Campaign>();
// For each item in campaigns:
var currentKey = new Tuple<string, Term>(item.CampaignName, item.Term);
Campaign existingCampaign;
if (dict.TryGetValue(currentKey, out existingCampaign))
{
    // already exists: merge item into existingCampaign
}
else
{
    // new: dict.Add(currentKey, item);
}
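Filling in the two branches, the complete loop could be sketched as follows. This is an illustration under assumptions, not the answerer's exact code: the Campaign and Conversion classes are hypothetical stand-ins, Term is assumed to be an int, and CampaignMerger.Merge is a name invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal types; the question does not show the real definitions.
public class Conversion { }

public class Campaign
{
    public string CampaignName { get; set; }
    public int Term { get; set; }               // assumed to be an int
    public int TotalVisits { get; set; }
    public Conversion[] Conversions { get; set; } = new Conversion[0];
}

public static class CampaignMerger
{
    public static List<Campaign> Merge(IEnumerable<Campaign> campaigns)
    {
        var dict = new Dictionary<Tuple<string, int>, Campaign>();
        foreach (var item in campaigns)
        {
            // Tuple implements value-based Equals and GetHashCode,
            // so it can serve directly as a composite dictionary key.
            var key = Tuple.Create(item.CampaignName, item.Term);
            Campaign existing;
            if (dict.TryGetValue(key, out existing))
            {
                // Already seen: fold this item into the stored campaign.
                existing.TotalVisits += item.TotalVisits;
                existing.Conversions = item.Conversions.Concat(existing.Conversions).ToArray();
            }
            else
            {
                // First occurrence of this (CampaignName, Term) pair.
                dict.Add(key, item);
            }
        }
        return dict.Values.ToList();
    }
}
```

Like the original loop, this mutates the first Campaign seen for each key. The enumeration order of dict.Values is not guaranteed by the Dictionary documentation, so sort the result if order matters.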
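Answer 2 (score: 1)
Can you avoid converting the 200,000 campaign items into a concrete list before adding them to the main list?
Here is what I would do; this is the new code: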
List<Campaign> listCampaigns = new List<Campaign>();
foreach (var item in campaigns)
{
    if (!listCampaigns.Any(a => a.CampaignName == item.CampaignName && a.Term == item.Term))
    {
        //this doesn't exist
        listCampaigns.Add(item);
    }
    else
    {
        //this exists already
        var campaign = listCampaigns.First(a => a.CampaignName == item.CampaignName && a.Term == item.Term);
        campaign.TotalVisits += item.TotalVisits;
        //Reduces the number of collection copies created per iteration from 3 to 1
        campaign.Conversions = campaign.Conversions.Concat(item.Conversions).ToArray();
    }
}
Answer 3 (score: 1)
In code:
foreach (var item in campaigns)
{
    var campaign = listCampaigns.FirstOrDefault(a => a.CampaignName == item.CampaignName && a.Term == item.Term);
    if (campaign == null)
    {
        //this doesn't exist
        listCampaigns.Add(item);
    }
    else
    {
        //this exists already
        campaign.TotalVisits += item.TotalVisits;
        List<Conversion> listConversions = item.Conversions.ToList();
        listConversions.AddRange(campaign.Conversions.ToList());
        campaign.Conversions = listConversions.ToArray();
    }
}
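Using FirstOrDefault avoids going through the list more than once per item. It also stops at the first match, so most of the time you will not evaluate the entire list, which saves additional time.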
Answer 4 (score: 0)
At the very least, use Any() instead of Count(). That way you do not have to go through the whole list:
if (listCampaigns.Where(a => a.CampaignName == item.CampaignName
&& a.Term == item.Term).Any())
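Also, as others have pointed out, a Dictionary is much faster for quick lookups. You have to define a unique key for each Campaign, and then you can look entries up by that key.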
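The difference between Count() and Any() described above can be made visible by counting how often the predicate runs. The sketch below is purely illustrative: the AnyVersusCount class and its method names are invented for this example, and it uses a plain int array rather than the Campaign list.

```csharp
using System.Linq;

public static class AnyVersusCount
{
    // Where(...).Count() must visit every element to produce a count.
    // Returns the number of predicate evaluations.
    public static int EvaluationsWithCount(int[] items, int target)
    {
        int evaluations = 0;
        items.Where(x => { evaluations++; return x == target; }).Count();
        return evaluations;
    }

    // Any(...) stops as soon as the predicate returns true.
    // Returns the number of predicate evaluations.
    public static int EvaluationsWithAny(int[] items, int target)
    {
        int evaluations = 0;
        items.Any(x => { evaluations++; return x == target; });
        return evaluations;
    }
}
```

For the array { 1, 2, 3, 4, 5 } and target 2, the Count() version evaluates the predicate 5 times and the Any() version 2 times. When the target is absent, both scan the whole array, so Any() only helps for items that already exist in the list.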
Answer 5 (score: 0)
Use a Dictionary<TKey,Campaign>. That way a hash table checks whether a value exists, and each lookup takes O(1).
Code example:
var dictCampaigns = new Dictionary<Key, Campaign>();
foreach (var item in campaigns)
{
    Campaign found;
    var key = new Key(item);
    if (!dictCampaigns.TryGetValue(key, out found))
    {
        dictCampaigns.Add(key, item);
    }
    else
    {
        found.TotalVisits += item.TotalVisits;
        found.Conversions = (item.Conversions.Concat(found.Conversions)).ToArray();
    }
}
I used a Key struct on the assumption that you may not be able to use Tuple:
struct Key
{
    public readonly string Name;
    public readonly int Term;
    public Key(Campaign camp)
    {
        Name = camp.CampaignName;
        Term = camp.Term;
    }
}
I measured it roughly with Stopwatch, and it is about twice as fast as your code, but I think there is still room for optimization.
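One candidate for that further optimization is the key type itself. A struct that does not override Equals and GetHashCode falls back to the ValueType defaults, which for a struct containing a reference-type field such as string generally rely on reflection and box the key on each comparison. A sketch of the same Key with explicit equality follows; it is a suggestion rather than something measured in this answer, and its constructor takes the name and term directly (instead of a Campaign) only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public struct Key : IEquatable<Key>
{
    public readonly string Name;
    public readonly int Term;

    public Key(string name, int term)
    {
        Name = name;
        Term = term;
    }

    // Strongly typed comparison; Dictionary's default comparer picks this up
    // through IEquatable<Key>, which avoids boxing the struct.
    public bool Equals(Key other)
    {
        return Term == other.Term && string.Equals(Name, other.Name, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return obj is Key && Equals((Key)obj);
    }

    // Combines both fields so that keys differing in either one
    // are likely to land in different buckets.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = Name == null ? 0 : Name.GetHashCode();
            return (hash * 397) ^ Term;
        }
    }
}
```

With this in place, new Key(item.CampaignName, item.Term) can be used as the dictionary key in the loop above. Whether it yields a noticeable gain depends on the data, so it is worth timing before and after.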