Improving a thread scheduling strategy

Question

I need help improving a thread scheduling strategy I am working on.

Background

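To set the context: I need to execute a large number (200,000 to 300,000) of "tasks". Each task can be executed independently. In practice, task execution times range from 40 milliseconds to 5 minutes. In addition, each individual task takes the same amount of time whenever it is re-run.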

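I need options to control how these tasks are executed, so I came up with a scheduling engine that schedules the tasks according to various strategies. The most basic strategy is FCFS (first come, first served): the tasks execute sequentially, one at a time. The second is a batch strategy: the scheduler has a bucket size "b" that controls how many threads run in parallel. The scheduler starts non-blocking threads for the first "b" tasks it receives, waits for those tasks to complete, then moves on to the next "b" tasks, starts them in parallel, and waits again. Each group of "b" tasks processed together is called a batch, hence the name batch scheduling.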
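A minimal Python sketch of the batch strategy described above, for illustration only. The name `run_batched` is hypothetical, and tasks are assumed to be plain callables that take no arguments:

```python
import threading

def run_batched(tasks, b):
    """Run callables in batches of at most b threads.

    Every thread in a batch must finish before the next batch starts.
    With b == 1 this degenerates to FCFS (one task at a time, in order).
    """
    for i in range(0, len(tasks), b):
        batch = [threading.Thread(target=task) for task in tasks[i:i + b]]
        for thread in batch:
            thread.start()
        for thread in batch:
            thread.join()  # block until the whole batch has completed
```

Calling `run_batched(tasks, 1)` gives the FCFS behaviour; larger values of `b` give the batch behaviour.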

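With batch scheduling, activity ramps up at the start of a batch as threads are created, peaks in the middle when most threads are running, and then tapers off as we block and wait for the threads to join. When the batch size "b" = 1, batch scheduling reduces to FCFS scheduling.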

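One way to improve on batch scheduling is what I call parallel scheduling: provided enough tasks are available, the scheduler ensures that "b" threads are running at any point in time. The thread count initially ramps up to "b" and then stays at "b" running threads until the last set of tasks finishes. To keep "b" threads executing at all times, we need to start a new thread whenever an old one finishes. Compared with batch scheduling, this approach can reduce the time needed to finish processing all tasks (in the average case).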
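Since each task takes the same time on every run, the two strategies can be compared offline from recorded durations. The sketch below is only a model: the function names are hypothetical, thread start-up overhead is ignored, and tasks are assumed to be dispatched in list order.

```python
import heapq

def batch_makespan(durations, b):
    """Total time under batch scheduling: each batch lasts as long as its slowest task."""
    return sum(max(durations[i:i + b]) for i in range(0, len(durations), b))

def parallel_makespan(durations, b):
    """Total time when a new task starts as soon as any of the b slots is free."""
    free_at = [0] * b  # min-heap of the times at which each slot becomes free
    for d in durations:
        start = heapq.heappop(free_at)      # earliest available slot
        heapq.heappush(free_at, start + d)  # slot is busy until start + d
    return max(free_at)

durations = [5, 1, 1, 1, 1, 5]
print(batch_makespan(durations, 2))     # 11
print(parallel_makespan(durations, 2))  # 9
```

In this example the batch strategy spends most of the first and last batches with one idle thread, which is where parallel scheduling saves time.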

The part I need help with

I have to implement the logic for parallel scheduling. I would be much obliged if anyone could help me with the following:

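Can we avoid using the startedTasks list? I am using it because I need to determine when Commit() can exit, that is, when all tasks have finished executing, so I simply loop through all the started tasks and block until each one completes. One problem at the moment is that this list will grow very long.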

- OR -

Is there a better way to do parallel scheduling?

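(Any other suggestions or strategies are also welcome. The main goal here is to shorten the overall execution time within the limit imposed by the batch size "b".)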

ParallelScheduler pseudocode

// assume all variable access/updates are thread safe
Semaphore S: with an initial capacity of "b"
Queue<Task> tasks
List<Task> startedTasks
bool allTasksCompleted = false;
bool stopPolling = false;

// The following method is called by a caller
// that wishes to start tasks; it can be called any number of times,
// passing various task items
METHOD void ScheduleTask( Task t )

    if the PollerThread is not started yet then start it
    // starting PollerThread will call PollerThread_Action


    // set up the task so that when it is completed, it releases 1
    // on semaphore S
    // assume OnCompleted is executed when the task t completes
    // execution after a call to t.Start()
    t.OnCompleted() ==> S.Release(1)

    tasks.Enqueue ( t )


// This method is called when the caller
// wishes to signal that there are no more tasks that need
// a ScheduleTask call.
METHOD void Commit()
    // assume that the following assignment is thread safe
    stopPolling = true;

    // assume that the following check is done efficiently
    wait until allTasksCompleted is set to true


// this is the method the poller thread once started will execute
METHOD void PollerThread_Action
    while ( !stopPolling )
        if ( tasks.Count > 0 )
            Task nextTask = tasks.Dequeue()
            // wait on the semaphore to release one unit
            if ( S.WaitOne() )
                // start the task in a new thread
                nextTask.Start()
                startedTasks.Add( nextTask )
    // we have been asked to stop polling
    // this means no more tasks are going to be added
    // to the queue
    // finish off the remaining tasks
    while ( tasks.Count > 0 )
        Task nextTask = tasks.Dequeue()
        if ( S.WaitOne() )
            nextTask.Start()
            startedTasks.Add ( nextTask )

    // at this point, there are no more tasks in the queue
    // each task would have already been started at some
    // point
    for every Task t in startedTasks
        t.WaitUntilComplete() // this will block if a task is running, else exit immediately

    // now all tasks are complete
    allTasksCompleted = true

Answer 1

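Search for "work-stealing scheduler". It is one of the most efficient general-purpose schedulers, and there are several open-source and commercial implementations.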

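The idea is to have a fixed number of worker threads that take tasks from a queue. To avoid contention on a single queue shared by all threads (a very bad performance problem on multi-CPU systems), each thread has its own queue. When a thread creates new tasks, it places them in its own queue. After finishing a task, a thread takes the next task from its own queue. If that queue is empty, the thread "steals" work from another thread's queue.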
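A minimal Python sketch of the idea, not a production scheduler. The class name `WorkStealingPool` and its methods are hypothetical. It assumes all tasks are submitted before `run()` is called and that tasks do not create further tasks, so a worker may exit once every queue is empty. Taking the newest task from one's own queue and stealing the oldest from a victim is a common convention that is also assumed here; the answer does not specify it.

```python
import threading
from collections import deque

class WorkStealingPool:
    def __init__(self, workers):
        self.queues = [deque() for _ in range(workers)]
        self.locks = [threading.Lock() for _ in range(workers)]

    def submit(self, worker, task):
        """Place a callable on the queue owned by the given worker index."""
        with self.locks[worker]:
            self.queues[worker].append(task)

    def _take(self, me):
        # Prefer the newest task from our own queue.
        with self.locks[me]:
            if self.queues[me]:
                return self.queues[me].pop()
        # Otherwise steal the oldest task from another worker's queue.
        n = len(self.queues)
        for k in range(1, n):
            victim = (me + k) % n
            with self.locks[victim]:
                if self.queues[victim]:
                    return self.queues[victim].popleft()
        return None

    def _worker(self, me):
        while True:
            task = self._take(me)
            if task is None:
                return  # every queue was empty: nothing left to run or steal
            task()

    def run(self):
        threads = [threading.Thread(target=self._worker, args=(i,))
                   for i in range(len(self.queues))]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
```

Even if every task is submitted to worker 0, the other workers stay busy by stealing from its queue.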

Answer 2

When your program knows that a task needs to run, put it in a queue data structure.

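When your program starts up, also start as many worker threads as you need. Arrange for each thread to do a blocking read from the queue whenever it needs something to do. When the queue is empty or nearly empty, most threads will be blocked, waiting for something to arrive in the queue.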

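When the queue holds many tasks, each thread pulls one task from the queue and executes it. When it finishes, it pulls another task and executes that one. Of course, this means tasks will complete in a different order from the one in which they were started. Presumably that is acceptable.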
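The worker-and-queue arrangement described above could look roughly like this in Python. It is a sketch under assumptions: `run_with_workers` is a hypothetical name, tasks are zero-argument callables, and `None` is used as a shutdown sentinel. Completion is tracked by the queue's own unfinished-task counter (`Queue.join()`), so no list of started tasks is needed.

```python
import queue
import threading

def run_with_workers(tasks, b):
    """Run callables on b long-lived worker threads fed from one blocking queue."""
    q = queue.Queue()
    errors = []

    def worker():
        while True:
            task = q.get()       # blocks while the queue is empty
            if task is None:     # sentinel: no more work for this worker
                q.task_done()
                return
            try:
                task()
            except Exception as exc:  # keep the worker alive if a task fails
                errors.append(exc)
            finally:
                q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(b)]
    for t in threads:
        t.start()
    for task in tasks:
        q.put(task)
    q.join()                     # returns once every queued task is marked done
    for _ in threads:
        q.put(None)              # one sentinel per worker
    for t in threads:
        t.join()
    return errors
```

At most `b` tasks run at once, and a new task starts as soon as any worker becomes free.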

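This is far superior to a strategy in which you must wait for all threads to finish their tasks before any of them can get another one. If long-running tasks are relatively rare in your system, you may find that you do not need much more optimization. If long-running tasks are common, you may want separate queues and separate threads for short-running and long-running tasks, so that short-running tasks are not starved by long-running ones.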
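One possible way to separate short-running and long-running tasks, sketched with the standard `concurrent.futures` module. The function name `run_split`, the `threshold` parameter, and the `(expected_seconds, callable)` pairing are all assumptions made for illustration; the expected durations would have to come from an earlier run.

```python
from concurrent.futures import ThreadPoolExecutor, wait

def run_split(tasks, threshold, short_workers, long_workers):
    """Route each (expected_seconds, callable) pair to a short or long pool.

    To stay within an overall limit of b threads, choose
    short_workers + long_workers == b.
    """
    with ThreadPoolExecutor(short_workers) as short_pool, \
         ThreadPoolExecutor(long_workers) as long_pool:
        futures = [
            (short_pool if expected < threshold else long_pool).submit(fn)
            for expected, fn in tasks
        ]
        wait(futures)  # block until every task in both pools has finished
    # Results in submission order; result() re-raises any task exception.
    return [f.result() for f in futures]
```

Each pool has its own internal queue, so a burst of long tasks cannot occupy the threads reserved for short ones.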

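There is one danger here: if some of your tasks run for a very long time (that is, they never finish because of a bug), you will eventually poison all of your threads, and the system will stop working.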

Answer 3

You want to use a space-filling curve (SFC) to subdivide the tasks. An SFC reduces a 2D complexity to a 1D complexity.
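The answer does not say which curve it has in mind or which two dimensions of the tasks would be mapped. Purely as an illustration of how a space-filling curve turns a 2D coordinate into a 1D index, here is the Z-order (Morton) curve, one of the simplest examples; the function name is hypothetical.

```python
def morton_index(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into a single Z-order index."""
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (2 * i)      # x supplies the even bit positions
        index |= ((y >> i) & 1) << (2 * i + 1)  # y supplies the odd bit positions
    return index

# The four cells of a 2x2 grid are visited in a "Z" pattern.
print([morton_index(x, y) for y in range(2) for x in range(2)])  # [0, 1, 2, 3]
```

Sorting items by such an index tends to keep items that are close in 2D close together in the resulting 1D order.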
