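This is another resource-allocation problem. My goal is to run a query that assigns the top-priority job for any time slot to one of two CPU cores (this is just an example, so let's assume no interrupts or multitasking). Note: this is similar to my earlier post about partitioning, but it focuses on overlapping times and on assigning multiple items, not just the top-priority item.
Here is our object: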
public class Job
{
    public int Id;
    public int Priority;
    public DateTime Begin;
    public DateTime End;
}
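The real data set is very large, but for this example assume there are 1000 jobs to assign to two CPU cores. They are all loaded into memory, and I need to run a single LINQ to Objects query against them. Currently this takes nearly 8 seconds and 1.4 million comparisons.
I use the logic cited in this post to determine whether two items overlap, but unlike that post, I don't just need to find overlapping items. I need to schedule the top item of any overlapping set, and then schedule the next one.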
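For readers who have not seen the overlap test before, here is a minimal sketch of it in isolation. It uses the same comparison as `Job.Overlaps` in the full sample code; the class name `OverlapDemo` and the flat `DateTime` parameters are only illustrative and are not part of the original code.

```csharp
using System;

public static class OverlapDemo
{
    // Two time ranges overlap when each one begins before the other one ends.
    // Because the comparisons are strict, ranges that merely touch
    // (one ends exactly when the other begins) do not count as overlapping.
    public static bool Overlaps(DateTime beginA, DateTime endA, DateTime beginB, DateTime endB)
    {
        return endA > beginB && beginA < endB;
    }
}
```

With the sample data, jobs 0 and 1 (00:00 to 00:30 and 00:01 to 00:31) overlap under this test, while job 0 and job 3 (01:00 to 01:30) do not.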
Before I get to the code, let me point out the steps of the current inefficient algorithm:
Problems:
Full sample code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class Job
{
    public static long Iterations;
    public int Id;
    public int Priority;
    public DateTime Begin;
    public DateTime End;

    public bool Overlaps(Job other)
    {
        Iterations++;
        return this.End > other.Begin && this.Begin < other.End;
    }
}

public class Assignment
{
    public Job Job;
    public int Core;
}

class Program
{
    static void Main(string[] args)
    {
        const int Jobs = 1000;
        const int Cores = 2;
        const int ConcurrentJobs = Cores + 1;
        const int Priorities = Cores + 3;
        DateTime startTime = new DateTime(2011, 3, 1, 0, 0, 0, 0);
        Console.WriteLine(string.Format("{0} Jobs x {1} Cores", Jobs, Cores));
        var timer = Stopwatch.StartNew();

        Console.WriteLine("Populating data");
        var jobs = new List<Job>();
        for (int jobId = 0; jobId < Jobs; jobId++)
        {
            var jobStart = startTime.AddHours(jobId / ConcurrentJobs).AddMinutes(jobId % ConcurrentJobs);
            jobs.Add(new Job() { Id = jobId, Priority = jobId % Priorities, Begin = jobStart, End = jobStart.AddHours(0.5) });
        }
        Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));
        timer.Restart();

        Console.WriteLine("Assigning Jobs to Cores");
        IEnumerable<Assignment> assignments = null;
        for (int core = 0; core < Cores; core++)
        {
            // avoid modified closures by creating local variables
            int localCore = core;
            var localAssignments = assignments;

            // Step 1: Determine the remaining jobs
            var remainingJobs = localAssignments == null ?
                jobs :
                from j in jobs where !(from a in localAssignments select a.Job).Contains(j) select j;

            // Step 2: Assign the top priority job in any time-slot to the core
            var assignmentsForCore = from s1 in remainingJobs
                                     where
                                         (from s2 in remainingJobs
                                          where s1.Overlaps(s2)
                                          orderby s2.Priority
                                          select s2).First().Equals(s1)
                                     select new Assignment { Job = s1, Core = localCore };

            // Step 3: Accumulate the results (unfortunately requires a .ToList() to avoid massive over-joins)
            assignments = assignments == null ? assignmentsForCore.ToList() : assignments.Concat(assignmentsForCore.ToList());
        }

        // This is where I'd like to Execute the query one single time across all cores, but have to do intermediate steps to avoid massive-over-joins
        assignments = assignments.ToList();
        Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));

        Console.WriteLine("\nJobs:");
        foreach (var job in jobs.Take(20))
        {
            Console.WriteLine(string.Format("{0}-{1} Id {2} P{3}", job.Begin, job.End, job.Id, job.Priority));
        }

        Console.WriteLine("\nAssignments:");
        foreach (var assignment in assignments.OrderBy(a => a.Job.Begin).Take(10))
        {
            Console.WriteLine(string.Format("{0}-{1} Id {2} P{3} C{4}", assignment.Job.Begin, assignment.Job.End, assignment.Job.Id, assignment.Job.Priority, assignment.Core));
        }

        Console.WriteLine(string.Format("\nTotal Comparisons: {0:n}", Job.Iterations));
        Console.WriteLine("Any key to continue");
        Console.ReadKey();
    }
}
Sample output:
1000 Jobs x 2 Cores
Populating data
Completed in 0.00ms
Assigning Jobs to Cores
Completed in 7,998.00ms
Jobs:
3/1/2011 12:00:00 AM-3/1/2011 12:30:00 AM Id 0 P0
3/1/2011 12:01:00 AM-3/1/2011 12:31:00 AM Id 1 P1
3/1/2011 12:02:00 AM-3/1/2011 12:32:00 AM Id 2 P2
3/1/2011 1:00:00 AM-3/1/2011 1:30:00 AM Id 3 P3
3/1/2011 1:01:00 AM-3/1/2011 1:31:00 AM Id 4 P4
3/1/2011 1:02:00 AM-3/1/2011 1:32:00 AM Id 5 P0
3/1/2011 2:00:00 AM-3/1/2011 2:30:00 AM Id 6 P1
3/1/2011 2:01:00 AM-3/1/2011 2:31:00 AM Id 7 P2
3/1/2011 2:02:00 AM-3/1/2011 2:32:00 AM Id 8 P3
3/1/2011 3:00:00 AM-3/1/2011 3:30:00 AM Id 9 P4
3/1/2011 3:01:00 AM-3/1/2011 3:31:00 AM Id 10 P0
3/1/2011 3:02:00 AM-3/1/2011 3:32:00 AM Id 11 P1
3/1/2011 4:00:00 AM-3/1/2011 4:30:00 AM Id 12 P2
3/1/2011 4:01:00 AM-3/1/2011 4:31:00 AM Id 13 P3
3/1/2011 4:02:00 AM-3/1/2011 4:32:00 AM Id 14 P4
3/1/2011 5:00:00 AM-3/1/2011 5:30:00 AM Id 15 P0
3/1/2011 5:01:00 AM-3/1/2011 5:31:00 AM Id 16 P1
3/1/2011 5:02:00 AM-3/1/2011 5:32:00 AM Id 17 P2
3/1/2011 6:00:00 AM-3/1/2011 6:30:00 AM Id 18 P3
3/1/2011 6:01:00 AM-3/1/2011 6:31:00 AM Id 19 P4
Assignments:
3/1/2011 12:00:00 AM-3/1/2011 12:30:00 AM Id 0 P0 C0
3/1/2011 12:01:00 AM-3/1/2011 12:31:00 AM Id 1 P1 C1
3/1/2011 1:00:00 AM-3/1/2011 1:30:00 AM Id 3 P3 C1
3/1/2011 1:02:00 AM-3/1/2011 1:32:00 AM Id 5 P0 C0
3/1/2011 2:00:00 AM-3/1/2011 2:30:00 AM Id 6 P1 C0
3/1/2011 2:01:00 AM-3/1/2011 2:31:00 AM Id 7 P2 C1
3/1/2011 3:01:00 AM-3/1/2011 3:31:00 AM Id 10 P0 C0
3/1/2011 3:02:00 AM-3/1/2011 3:32:00 AM Id 11 P1 C1
3/1/2011 4:00:00 AM-3/1/2011 4:30:00 AM Id 12 P2 C0
3/1/2011 4:01:00 AM-3/1/2011 4:31:00 AM Id 13 P3 C1
3/1/2011 5:00:00 AM-3/1/2011 5:30:00 AM Id 15 P0 C0
Total Comparisons: 1,443,556.00
Any key to continue
Answer 0 (score: 1)
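Is there a reason to use LINQ to Objects for this task? I think I would create an active list, put all the jobs in a queue, and whenever the active list drops below 10, pop the next job off the queue and put it in the active list. It is easy to track which core is executing which task and to assign the next task in the queue to the least busy core. Hook a completed event up to the job, or just monitor the active list, and you will know when to pop another job off the queue and into the active list.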
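A rough sketch of what this could look like, under several assumptions that the answer does not spell out: the queue is ordered by `Priority` with lower values first (matching the `orderby` in the question), "least busy" means the core with the fewest active jobs, and completion is signalled by calling `Complete` by hand rather than through a real event. The names `QueuedJob`, `Dispatcher`, `Fill`, and `Complete` are hypothetical, and the sketch ignores the `Begin`/`End` windows, so it will not reproduce the question's output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class QueuedJob
{
    public int Id;
    public int Priority;
}

public class Dispatcher
{
    private readonly Queue<QueuedJob> pending;
    private readonly List<QueuedJob>[] activeByCore;
    private readonly int maxActive;

    public Dispatcher(IEnumerable<QueuedJob> jobs, int cores, int maxActive)
    {
        // Assumption: a lower Priority value is more urgent. OrderBy is a
        // stable sort, so equal priorities keep their original order.
        pending = new Queue<QueuedJob>(jobs.OrderBy(j => j.Priority));
        activeByCore = new List<QueuedJob>[cores];
        for (int c = 0; c < cores; c++)
            activeByCore[c] = new List<QueuedJob>();
        this.maxActive = maxActive;
    }

    public int ActiveCount { get { return activeByCore.Sum(list => list.Count); } }
    public int PendingCount { get { return pending.Count; } }

    // Pops jobs off the queue until the active list is full or the queue is
    // empty. Returns the (job id, core) pairs started by this call.
    public List<KeyValuePair<int, int>> Fill()
    {
        var started = new List<KeyValuePair<int, int>>();
        while (pending.Count > 0 && ActiveCount < maxActive)
        {
            QueuedJob job = pending.Dequeue();
            int core = LeastBusyCore();
            activeByCore[core].Add(job);
            started.Add(new KeyValuePair<int, int>(job.Id, core));
        }
        return started;
    }

    // Stand-in for a "completed" event handler: frees the job's slot and
    // immediately refills the active list from the queue.
    public List<KeyValuePair<int, int>> Complete(int jobId)
    {
        foreach (var list in activeByCore)
            list.RemoveAll(j => j.Id == jobId);
        return Fill();
    }

    // Assumption: "least busy" means fewest active jobs; ties go to the
    // lowest core index.
    private int LeastBusyCore()
    {
        int best = 0;
        for (int c = 1; c < activeByCore.Length; c++)
            if (activeByCore[c].Count < activeByCore[best].Count)
                best = c;
        return best;
    }
}
```

A caller would construct `new Dispatcher(jobs, 2, 10)`, call `Fill()` once to start the first batch, and then call `Complete(id)` whenever a job finishes. Each dequeue and core pick is cheap, so the pairwise overlap comparisons from the question are avoided, at the cost of not honouring fixed time windows.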
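Answer 1 (score: 0)
I would rather do it in a single loop. Mine produces different results from yours. Yours schedules 2/3 of all jobs. Mine schedules all of them. I will add an explanation later. Off to an appointment now.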
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class Job
{
    public static long Iterations;
    public int Id;
    public int Priority;
    public DateTime Begin;
    public DateTime End;

    public bool Overlaps(Job other)
    {
        Iterations++;
        return this.End > other.Begin && this.Begin < other.End;
    }
}

public class Assignment : IComparable<Assignment>
{
    public Job Job;
    public int Core;

    #region IComparable<Assignment> Members
    public int CompareTo(Assignment other)
    {
        return Job.Begin.CompareTo(other.Job.Begin);
    }
    #endregion
}

class Program
{
    static void Main(string[] args)
    {
        const int Jobs = 1000;
        const int Cores = 2;
        const int ConcurrentJobs = Cores + 1;
        const int Priorities = Cores + 3;
        DateTime startTime = new DateTime(2011, 3, 1, 0, 0, 0, 0);
        Console.WriteLine(string.Format("{0} Jobs x {1} Cores", Jobs, Cores));
        var timer = Stopwatch.StartNew();

        Console.WriteLine("Populating data");
        var jobs = new List<Job>();
        for (int jobId = 0; jobId < Jobs; jobId++)
        {
            var jobStart = startTime.AddHours(jobId / ConcurrentJobs).AddMinutes(jobId % ConcurrentJobs);
            jobs.Add(new Job() { Id = jobId, Priority = jobId % Priorities, Begin = jobStart, End = jobStart.AddHours(0.5) });
        }
        Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));
        // Restart (not Reset) so the stopwatch keeps running for the next measurement
        timer.Restart();

        Console.WriteLine("Assigning Jobs to Cores");
        List<Assignment>[] assignments = new List<Assignment>[Cores];
        for (int core = 0; core < Cores; core++)
            assignments[core] = new List<Assignment>();
        Job[] lastJobs = new Job[Cores];
        foreach (Job j in jobs)
        {
            Job job = j;
            bool assigned = false;
            for (int core = 0; core < Cores; core++)
            {
                if (lastJobs[core] == null || !lastJobs[core].Overlaps(job))
                {
                    // Assign directly if no last job or no overlap with last job
                    lastJobs[core] = job;
                    assignments[core].Add(new Assignment { Job = job, Core = core });
                    assigned = true;
                    break;
                }
                else if (job.Priority > lastJobs[core].Priority)
                {
                    // Overlap and higher priority, so we replace
                    Job temp = lastJobs[core];
                    lastJobs[core] = job;
                    job = temp; // Will try to later assign to other core
                    assignments[core].Add(new Assignment { Job = job, Core = core });
                    assigned = true;
                    break;
                }
            }
            if (!assigned)
            {
                // TODO: What to do if not assigned? Your code seems to just ignore them
            }
        }
        List<Assignment> merged = new List<Assignment>();
        for (int core = 0; core < Cores; core++)
            merged.AddRange(assignments[core]);
        merged.Sort();
        Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));
        timer.Restart();
        Console.WriteLine(string.Format("\nTotal Comparisons: {0:n}", Job.Iterations));
        Job.Iterations = 0; // Reset to count again

        {
            IEnumerable<Assignment> assignments2 = null;
            for (int core = 0; core < Cores; core++)
            {
                // avoid modified closures by creating local variables
                int localCore = core;
                var localAssignments = assignments2;

                // Step 1: Determine the remaining jobs
                var remainingJobs = localAssignments == null ?
                    jobs :
                    from j in jobs where !(from a in localAssignments select a.Job).Contains(j) select j;

                // Step 2: Assign the top priority job in any time-slot to the core
                var assignmentsForCore = from s1 in remainingJobs
                                         where
                                             (from s2 in remainingJobs
                                              where s1.Overlaps(s2)
                                              orderby s2.Priority
                                              select s2).First().Equals(s1)
                                         select new Assignment { Job = s1, Core = localCore };

                // Step 3: Accumulate the results (unfortunately requires a .ToList() to avoid massive over-joins)
                assignments2 = assignments2 == null ? assignmentsForCore.ToList() : assignments2.Concat(assignmentsForCore.ToList());
            }

            // This is where I'd like to Execute the query one single time across all cores, but have to do intermediate steps to avoid massive-over-joins
            assignments2 = assignments2.ToList();
            Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));

            Console.WriteLine("\nJobs:");
            foreach (var job in jobs.Take(20))
            {
                Console.WriteLine(string.Format("{0}-{1} Id {2} P{3}", job.Begin, job.End, job.Id, job.Priority));
            }

            Console.WriteLine("\nAssignments:");
            foreach (var assignment in assignments2.OrderBy(a => a.Job.Begin).Take(10))
            {
                Console.WriteLine(string.Format("{0}-{1} Id {2} P{3} C{4}", assignment.Job.Begin, assignment.Job.End, assignment.Job.Id, assignment.Job.Priority, assignment.Core));
            }

            // Compare the single-loop result against the LINQ result
            if (merged.Count != assignments2.Count())
                System.Console.WriteLine("Difference count {0}, {1}", merged.Count, assignments2.Count());
            for (int i = 0; i < merged.Count() && i < assignments2.Count(); i++)
            {
                var a2 = assignments2.ElementAt(i);
                var a = merged[i];
                if (a.Job.Id != a2.Job.Id)
                    System.Console.WriteLine("Difference at {0} {1} {2}", i, a.Job.Begin, a2.Job.Begin);
                if (i % 100 == 0) Console.ReadKey();
            }
        }

        Console.WriteLine(string.Format("\nTotal Comparisons: {0:n}", Job.Iterations));
        Console.WriteLine("Any key to continue");
        Console.ReadKey();
    }
}
Removed due to a major bug. Redoing it... :P