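Fastest way to find the intervals that contain a given point in time

Time: 2015-01-28 09:28:04

Tags: c# algorithm datetime intervals

I need to create a time-based "schedule structure" with the following methods: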

void addTask(DateTime startTime, int durationInMinutes, TaskObject myObj)
{
    // add the TaskObject to the calendar structure
}
List<TaskObject> getRunningTasks(DateTime startTime, DateTime endTime)
{
    // must efficiently return the list of tasks running in the specified time frame
}
List<TaskObject> getRunningTasks(DateTime exactTime)
{
    return getRunningTasks(exactTime, exactTime);
}

I have about 60k TaskObjects to process, and they need to be recalculated per hour and per minute (getRunningTasks will be called about 400k times).

Right now I use:

public Dictionary<long, Dictionary<int, Dictionary<int, List< TaskObject>>>> scheduleTable;

scheduleTable[dayInTicks][hour][minute]

I add each task to every hour and minute in which it is scheduled.
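A minimal sketch of how such a table could be filled and queried, assuming each task is registered under every minute it covers. The MinuteTable class and its Add/At members are hypothetical names for illustration, not the actual code behind the description above.

```csharp
using System;
using System.Collections.Generic;

public class MinuteTable<T>
{
    // day (ticks of the date at midnight) -> hour -> minute -> tasks
    private readonly Dictionary<long, Dictionary<int, Dictionary<int, List<T>>>> table =
        new Dictionary<long, Dictionary<int, Dictionary<int, List<T>>>>();

    // Registers the task under every whole minute from start (truncated to the
    // minute) up to, but not including, start + durationInMinutes.
    public void Add(T task, DateTime start, int durationInMinutes)
    {
        var first = new DateTime(start.Year, start.Month, start.Day, start.Hour, start.Minute, 0);
        for (int i = 0; i < durationInMinutes; i++)
        {
            Bucket(first.AddMinutes(i), true).Add(task);
        }
    }

    // Returns the tasks registered for the minute containing the given time.
    public List<T> At(DateTime time)
    {
        return Bucket(time, false) ?? new List<T>();
    }

    // Walks the three dictionary levels; creates missing levels only when asked to.
    private List<T> Bucket(DateTime t, bool create)
    {
        long day = t.Date.Ticks;
        Dictionary<int, Dictionary<int, List<T>>> hours;
        if (!table.TryGetValue(day, out hours))
        {
            if (!create) return null;
            table[day] = hours = new Dictionary<int, Dictionary<int, List<T>>>();
        }
        Dictionary<int, List<T>> minutes;
        if (!hours.TryGetValue(t.Hour, out minutes))
        {
            if (!create) return null;
            hours[t.Hour] = minutes = new Dictionary<int, List<T>>();
        }
        List<T> tasks;
        if (!minutes.TryGetValue(t.Minute, out tasks))
        {
            if (!create) return null;
            minutes[t.Minute] = tasks = new List<T>();
        }
        return tasks;
    }
}
```

Lookups are cheap with this layout, but every task is stored once per minute of its duration, so insertion time and memory grow with task length.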

Based on DrKoch's idea:

    public class TaskList
    {
        private SortedDictionary<DateTime, TaskObject> startTimes;
        private SortedDictionary<DateTime, TaskObject> endTimes;
        private SortedSet<DateTime> startTimeIndexes;
        private SortedSet<DateTime> endTimeIndexes;
        public TaskList()
        {
            reset();
        }
        public void addTask(TaskObject taskToAdd, DateTime startTime, int durationInMinutes)
        {
            // start time
            while (startTimes.ContainsKey(startTime))
            {
                startTime = startTime.AddTicks(1);
            }
            startTimes.Add(startTime, taskToAdd);
            startTimeIndexes.Add(startTime);
            //end time
            DateTime endTime = startTime.AddMinutes(durationInMinutes);
            while (endTimes.ContainsKey(endTime))
            {
                endTime = endTime.AddTicks(1);
            }
            endTimes.Add(endTime, taskToAdd);
            endTimeIndexes.Add(endTime);
        }
        public List<TaskObject> getRunningTasks(DateTime startTime, DateTime endTime)
        {
            DateTime fromBeginningOfDay = new DateTime(endTime.Year, endTime.Month, endTime.Day);
            SortedSet<DateTime> myEndTimeIndexes = endTimeIndexes.GetViewBetween(fromBeginningOfDay, startTime); // tasks that already finished during the specified day
            SortedSet<DateTime> myStartTimeIndexes = startTimeIndexes.GetViewBetween(fromBeginningOfDay, endTime); // tasks that started since the beginning of the day
            List<TaskObject> result = new List<TaskObject>();
            // Fill result with all matching tasks
            foreach (DateTime myStartTimeIndex in myStartTimeIndexes)
            {
                result.Add(startTimes[myStartTimeIndex]);
            }
            // Remove finished tasks from result
            foreach (DateTime myEndTimeIndex in myEndTimeIndexes)
            {
                if (result.Contains(endTimes[myEndTimeIndex]))
                {
                    result.Remove(endTimes[myEndTimeIndex]);
                }
            }
            return result;
        }
        public List<TaskObject> getRunningTasks(DateTime exactTime)
        {
            return getRunningTasks(exactTime, exactTime.AddSeconds(1));
        }
        public void reset()
        {
            startTimes = new SortedDictionary<DateTime, TaskObject>();
            endTimes = new SortedDictionary<DateTime, TaskObject>();
            startTimeIndexes = new SortedSet<DateTime>();
            endTimeIndexes = new SortedSet<DateTime>();
        }
    }
    public class TaskObject
    {
        public string Name;
        public TaskObject(string name)
        {
            Name = name;
        }
    }

3 answers:

Answer 0 (score: 16)

Suppose you store your tasks in a class like this:

public class MyTask
{
    public string name;
    public DateTime startDt;
    public DateTime endDt;
    // ...
}

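The basic idea is to keep the tasks in two collections, one sorted by startDt and the other by endDt.

We will use SortedSet for two reasons:

  1. It has a computational complexity of O(log n) for insertion and search. With many items, you really need something better than O(n).

  2. It can return all items within some "range". You don't need to know the exact "key" for retrieval, as you would with a Dictionary.

Since all items in a SortedSet are unique, and several tasks may share the same startDt or endDt, we cannot store the tasks directly in the SortedSet. Instead, each entry is a "bucket" holding all tasks with the same time: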

    public class SameTimeTaskList
    {
        public DateTime time; // common start or end time of all tasks in list
        public List<MyTask> taskList = new List<MyTask>();
    }
    

The sort criterion for these buckets is, of course, time:

    // Defines a comparer to create a sorted set 
    // that is sorted by time. 
    public class ByTime : IComparer<SameTimeTaskList>
    {
        public int Compare(SameTimeTaskList x, SameTimeTaskList y)
        {
            return x.time.CompareTo(y.time);
        }
    }
    

With this we can build our two sorted sets:

    SortedSet<SameTimeTaskList> startTimeSet = 
      new SortedSet<SameTimeTaskList>(new ByTime());
    SortedSet<SameTimeTaskList> endTimeSet = 
      new SortedSet<SameTimeTaskList>(new ByTime());
    

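A new task is inserted into both sets. If no bucket exists yet for its time, a new one is created; otherwise the task is simply added to the existing bucket: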

        public void Add(MyTask task)
        {
            // startTimeSet
            refTime.time = task.startDt;
            var lst = startTimeSet.GetViewBetween(refTime,
                refTime).FirstOrDefault();
            if (lst == null) // no bucket found for time
            {
                lst = new SameTimeTaskList { time = task.startDt };
                startTimeSet.Add(lst);
            }
            lst.taskList.Add(task); // add task to bucket
            // endTimeSet
            refTime.time = task.endDt;
            lst = endTimeSet.GetViewBetween(refTime,
                refTime).FirstOrDefault();
            if (lst == null) // no bucket found for time
            {
                lst = new SameTimeTaskList { time = task.endDt };
                endTimeSet.Add(lst);
            }
            lst.taskList.Add(task); // add task to bucket
        }
    

Now it is easy to get all intervals that are active at a given exactTime. Each task must satisfy two conditions:

    task.startDt <= exactTime
    &&
    task.endDt >= exactTime
    

We query both SortedSets to see which one returns the smaller set for its condition. Then we check every task in that smaller set against the second condition:

        public IEnumerable<MyTask> Get(DateTime exactTime)
        {
            refTime.time = exactTime;
            // set of all tasks started before exactTime
            SortedSet<SameTimeTaskList> sSet =
               startTimeSet.GetViewBetween(minTime, refTime);
            // set of all tasks ended after exactTime
            SortedSet<SameTimeTaskList> eSet =
               endTimeSet.GetViewBetween(refTime, maxTime);
    
            List<MyTask> result = new List<MyTask>();
            if (sSet.Count < eSet.Count) // check smaller set for 2nd condition
            {
                foreach (var tl in sSet)
                    foreach (MyTask tsk in tl.taskList)
                        if(tsk.endDt >= exactTime) result.Add(tsk);
            }
            else // eSet is smaller
            {
                foreach (var tl in eSet)
                    foreach (MyTask tsk in tl.taskList)
                        if (tsk.startDt <= exactTime) result.Add(tsk);
    
            }
            return result;
        }
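The question also asks for a lookup over a time frame rather than a single instant. A minimal sketch of that generalization, assuming closed intervals and a task sequence already sorted by startDt (for example, the contents of a start-time sorted set). The RangeTask and RangeQuery names are hypothetical; this is an illustration of the overlap test, not part of the program below.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class RangeTask
{
    public string name;
    public DateTime startDt;
    public DateTime endDt;
}

public static class RangeQuery
{
    // Two closed intervals overlap when each one starts no later than the other ends.
    public static bool Overlaps(RangeTask t, DateTime from, DateTime to)
    {
        return t.startDt <= to && t.endDt >= from;
    }

    // tasksByStart must be sorted ascending by startDt.
    public static List<RangeTask> Get(IEnumerable<RangeTask> tasksByStart, DateTime from, DateTime to)
    {
        return tasksByStart
            .TakeWhile(t => t.startDt <= to)  // stop at the first task that starts after the window
            .Where(t => t.endDt >= from)      // drop tasks that ended before the window
            .ToList();
    }
}
```

With from equal to to, this reduces to the two conditions shown earlier for a single exactTime.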
    

Here is the complete code as a working program:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    
    namespace IntervallsTest
    {
        class Program
        {
            static void Main(string[] args)
            {
                DateTime exactDate = DateTime.Parse("2015-6-1");
    
                var tc = new TaskCollection();
                tc.Add(new MyTask { name = "T1", startDt = DateTime.Parse("2015-1-1"), endDt = DateTime.Parse("2015-02-01") });
                tc.Add(new MyTask { name = "T2", startDt = DateTime.Parse("2015-1-1"), endDt = DateTime.Parse("2015-07-01") });
                tc.Add(new MyTask { name = "T2a", startDt = DateTime.Parse("2015-1-1"), endDt = DateTime.Parse("2015-07-02") });
                tc.Add(new MyTask { name = "T3", startDt = DateTime.Parse("2015-05-1"), endDt = DateTime.Parse("2015-12-31") });
                tc.Add(new MyTask { name = "T3a", startDt = DateTime.Parse("2015-04-1"), endDt = DateTime.Parse("2015-12-31") });
                tc.Add(new MyTask { name = "T4", startDt = DateTime.Parse("2015-12-1"), endDt = DateTime.Parse("2015-12-31") });
    
                var result = tc.Get(exactDate);
    
                Console.WriteLine("These tasks are active at " + exactDate);
                foreach (var tsk in result)
                {
                    Console.WriteLine(tsk.name);
                }
                Console.WriteLine("press any key");
                Console.ReadKey();
            }
        }
    
        public class TaskCollection
        {
            SortedSet<SameTimeTaskList> startTimeSet = new SortedSet<SameTimeTaskList>(new ByTime());
            SortedSet<SameTimeTaskList> endTimeSet = new SortedSet<SameTimeTaskList>(new ByTime());
    
            static SameTimeTaskList refTime = new SameTimeTaskList();
            static SameTimeTaskList minTime = new SameTimeTaskList { time = DateTime.MinValue };
            static SameTimeTaskList maxTime = new SameTimeTaskList { time = DateTime.MaxValue };
    
            public void Add(MyTask task)
            {
                // startTimeSet
                refTime.time = task.startDt;
                var lst = startTimeSet.GetViewBetween(refTime, refTime).FirstOrDefault();
                if (lst == null) // no bucket found for time
                {
                    lst = new SameTimeTaskList { time = task.startDt };
                    startTimeSet.Add(lst);
                }
                lst.taskList.Add(task); // add task to bucket
                // endTimeSet
                refTime.time = task.endDt;
                lst = endTimeSet.GetViewBetween(refTime, refTime).FirstOrDefault();
                if (lst == null) // no bucket found for time
                {
                    lst = new SameTimeTaskList { time = task.endDt };
                    endTimeSet.Add(lst);
                }
                lst.taskList.Add(task); // add task to bucket
            }
    
            public IEnumerable<MyTask> Get(DateTime exactTime)
            {
                refTime.time = exactTime;
                // set of all tasks started before exactTime
                SortedSet<SameTimeTaskList> sSet = startTimeSet.GetViewBetween(minTime, refTime);
                // set of all tasks ended after exactTime
                SortedSet<SameTimeTaskList> eSet = endTimeSet.GetViewBetween(refTime, maxTime);
    
                List<MyTask> result = new List<MyTask>();
                if (sSet.Count < eSet.Count) // check smaller set for 2nd condition
                {
                    foreach (var tl in sSet)
                        foreach (MyTask tsk in tl.taskList)
                            if(tsk.endDt >= exactTime) result.Add(tsk);
                }
                else // eSet is smaller
                {
                    foreach (var tl in eSet)
                        foreach (MyTask tsk in tl.taskList)
                            if (tsk.startDt <= exactTime) result.Add(tsk);
    
                }
                return result;
            }
        }
    
        public class MyTask
        {
            public string name;
            public DateTime startDt;
            public DateTime endDt;
            // ...
        }
    
        public class SameTimeTaskList
        {
            public DateTime time; // common start or end time of all tasks in list
            public List<MyTask> taskList = new List<MyTask>();
        }
    
        // Defines a comparer to create a sorted set 
        // that is sorted by time. 
        public class ByTime : IComparer<SameTimeTaskList>
        {
            public int Compare(SameTimeTaskList x, SameTimeTaskList y)
            {
                return x.time.CompareTo(y.time);
            }
        }
    }
    

I would be very interested to see benchmark results when you compare this with all the other versions you have tried.
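For such a comparison it helps to have a trivially correct baseline. The following is a minimal sketch, assuming a plain O(n) scan as the reference and Stopwatch for timing. BenchTask, Baseline, MakeTasks, GetBruteForce and Measure are hypothetical names, and the random data is only a stand-in for real tasks.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class BenchTask
{
    public string name;
    public DateTime startDt;
    public DateTime endDt;
}

public static class Baseline
{
    // Generates pseudo-random tasks starting in 2015; the seed makes runs repeatable.
    public static List<BenchTask> MakeTasks(int count, int seed)
    {
        var rnd = new Random(seed);
        var origin = new DateTime(2015, 1, 1);
        var tasks = new List<BenchTask>(count);
        for (int i = 0; i < count; i++)
        {
            DateTime start = origin.AddMinutes(rnd.Next(0, 365 * 24 * 60));
            tasks.Add(new BenchTask
            {
                name = "T" + i,
                startDt = start,
                endDt = start.AddMinutes(rnd.Next(1, 24 * 60))
            });
        }
        return tasks;
    }

    // O(n) reference implementation: checks both conditions for every task.
    public static List<BenchTask> GetBruteForce(List<BenchTask> tasks, DateTime exactTime)
    {
        return tasks.Where(t => t.startDt <= exactTime && t.endDt >= exactTime).ToList();
    }

    // Runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Feeding the same tasks and query times to each candidate structure, checking its results against GetBruteForce, and timing a few hundred thousand calls with Measure should give comparable numbers.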

Answer 1 (score: 0)

Note: For a more complete solution, see my second answer.


You could build two additional sorted dictionaries:

SortedDictionary<DateTime, Task> startTimes; // startTime -> Task
SortedDictionary<DateTime, Task> endTimes;   // endTime -> Task

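These dictionaries allow fast (O(log N)) access to all tasks that started before exactTime and to all tasks that ended after exactTime.

You are looking for the intersection of these two sets.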


A better collection is SortedSet, which has a

SortedSet<T>.GetViewBetween()

method that does all the work: it can return all tasks in startTimesSet whose startTime is before exactTime.
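A minimal sketch of the intersection idea using two SortedSet views. It assumes each task carries a unique Id, used here only as a tie-breaker so that tasks with equal times can coexist in a set. Job, IntersectionSketch and ActiveAt are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Job
{
    public int Id;          // hypothetical unique id, used only as a tie-breaker
    public DateTime Start;
    public DateTime End;
}

public static class IntersectionSketch
{
    // Orders by time first, then by id, so equal times do not collide.
    private static int Cmp(DateTime a, int idA, DateTime b, int idB)
    {
        int c = a.CompareTo(b);
        return c != 0 ? c : idA.CompareTo(idB);
    }

    // Returns the ids of all jobs with Start <= exactTime <= End, in ascending order.
    public static List<int> ActiveAt(IEnumerable<Job> jobs, DateTime exactTime)
    {
        var byStart = new SortedSet<Job>(
            Comparer<Job>.Create((x, y) => Cmp(x.Start, x.Id, y.Start, y.Id)));
        var byEnd = new SortedSet<Job>(
            Comparer<Job>.Create((x, y) => Cmp(x.End, x.Id, y.End, y.Id)));
        foreach (var j in jobs)
        {
            byStart.Add(j);
            byEnd.Add(j);
        }

        // Probe objects that only serve as range bounds; the same pair works for both sets.
        var lo = new Job { Id = int.MinValue, Start = DateTime.MinValue, End = exactTime };
        var hi = new Job { Id = int.MaxValue, Start = exactTime, End = DateTime.MaxValue };

        SortedSet<Job> started = byStart.GetViewBetween(lo, hi);  // Start <= exactTime
        SortedSet<Job> notEnded = byEnd.GetViewBetween(lo, hi);   // End >= exactTime

        return started.Intersect(notEnded).Select(j => j.Id).OrderBy(id => id).ToList();
    }
}
```

In real use the two sets would be built once and kept, not rebuilt per query. Enumerating both views and intersecting them still touches every element of each view, so filtering only the smaller view against the other condition is usually cheaper.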

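Answer 2 (score: 0)

This problem is similar to selecting two-dimensional points that fall inside a given rectangle. Unfortunately, it cannot be solved directly with binary search.

The main approach to this problem is to divide the "plane" into squares. A small example: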

// minDate is the earliest possible date
// maxDate is the latest possible date
// interval is the size of one cell in days, e.g. 10 or 30
var size = (maxDate.Subtract(minDate).Days + interval)/interval;
// the indices computed below run from 1 to size, hence size + 1
var tasks = new List<Task>[size + 1, size + 1];

// for each new task:
var startDate = ...
var endDate = ...
var x = (startDate.Subtract(minDate).Days + interval)/interval;
var y = (endDate.Subtract(minDate).Days + interval)/interval;

if (tasks[x, y] == null)
    tasks[x, y] = new List<Task>();

tasks[x, y].Add(newTask);

// search
var startPeriod = ...
var endPeriod = ...
var minIndex = (startPeriod.Subtract(minDate).Days + interval)/interval;
var maxIndex = (endPeriod.Subtract(minDate).Days + interval)/interval;

for (int x = minIndex + 1; x < maxIndex - 1; x++)
    for (int y = minIndex + 1; y < maxIndex - 1; y++)
        tasks[x, y] ... // All these tasks are yours

for (int x = minIndex; x < maxIndex; x++)
    foreach(var task in tasks[x, minIndex])
        if (task.startDate >= startPeriod && task.endDate <= endPeriod)
            ... // All these tasks also are yours

// Repeat last for/foreach for every boundary interval, since not all tasks
// may be yours there
...

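In the boundary "squares" you have to find the matching tasks by brute force. If that is too slow, you can use a SortedList instead of a List. That reduces the brute-force time but does not eliminate it entirely.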
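A minimal sketch of why sorting a boundary cell helps, assuming the cell's tasks are kept in a List sorted by startDate (used here as a stand-in for SortedList, which requires unique keys). A binary search skips the tasks that start too early, and the scan stops at the first task that starts too late. CellTask, CellSearch, LowerBound and Inside are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public class CellTask
{
    public string name;
    public DateTime startDate;
    public DateTime endDate;
}

public static class CellSearch
{
    // Index of the first task whose startDate is >= value (cell sorted by startDate).
    public static int LowerBound(List<CellTask> cell, DateTime value)
    {
        int lo = 0, hi = cell.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (cell[mid].startDate < value) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Tasks lying entirely inside [startPeriod, endPeriod].
    public static List<CellTask> Inside(List<CellTask> cell, DateTime startPeriod, DateTime endPeriod)
    {
        var result = new List<CellTask>();
        for (int i = LowerBound(cell, startPeriod); i < cell.Count; i++)
        {
            if (cell[i].startDate > endPeriod) break;              // everything after this starts too late
            if (cell[i].endDate <= endPeriod) result.Add(cell[i]); // the end date still needs a check
        }
        return result;
    }
}
```

The end-date check inside the loop is the part that sorting by start date cannot remove.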