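Multithreaded execution that preserves the order of completed work items

Time: 2016-04-05 17:56:48

Tags: java multithreading performance concurrency java.util.concurrent

I have a stream of units of work, let's call them "work items", that are processed sequentially (for now). I would like to speed up processing by doing the work multithreaded.

Constraint: the work items arrive in a specific order. The order is irrelevant during processing, but once processing has finished the order must be restored.

Something like this: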

   |.|
   |.|
   |4|
   |3|
   |2|    <- incoming queue
   |1|
  / | \
 2  1  3  <- worker threads
  \ | /
   |3|
   |2|    <- outgoing queue
   |1|

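I would like to solve this in Java, preferably without Executor Services, Futures, etc., using only basic concurrency methods such as wait(), notify(), etc.

The reason: my work items are very small and fine-grained, each one finishing in about 0.2 milliseconds. So I am worried that using things from java.util.concurrent.* might introduce a lot of overhead and slow my code down.

The examples I have found so far all preserve the order during processing (which is irrelevant in my case) and do not care about the order after processing (which is crucial in my case).

8 answers:

Answer 0: (score: 4)

If you allow BlockingQueue, why would you ignore the rest of the concurrency utilities in Java? You could use, for example, a Stream (if you have Java 1.8) for the above: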

List<Type> data = ...;
List<Other> out = data.parallelStream()
    .map(t -> doSomeWork(t))
    .collect(Collectors.toList());

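Since you start from an ordered Collection (a List) and also collect into a List, your results will be in the same order as the input.

Answer 1: (score: 4)

This is how I solved your problem in a previous project (but using java.util.concurrent):

(1) The WorkItem class does the actual work/processing: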
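
To make the ordering claim concrete, here is a minimal runnable sketch. The class name OrderedParallelMap and the squaring body of doSomeWork are my own placeholders, not part of the answer; the point is only that map() may run the elements on several threads while collect(Collectors.toList()) still returns them in the encounter order of the source list.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class OrderedParallelMap {
    // Hypothetical stand-in for the real per-item processing.
    static int doSomeWork(int t) {
        return t * t;
    }

    // Elements may be processed on different threads, but the
    // collected list keeps the encounter order of the input list.
    static List<Integer> process(List<Integer> data) {
        return data.parallelStream()
                .map(OrderedParallelMap::doSomeWork)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = IntStream.rangeClosed(1, 1000)
                .boxed()
                .collect(Collectors.toList());
        List<Integer> out = process(data);
        System.out.println(out.subList(0, 5)); // prints [1, 4, 9, 16, 25]
    }
}
```

Whether this is actually faster for 0.2 ms items depends on the workload, so it is worth measuring before committing to it.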

public class WorkItem implements Callable<WorkItem> {
    Object content;
    public WorkItem(Object content) {
        super();
        this.content = content;
    }

    public WorkItem call() throws Exception {
        // getContent() + do your processing
        return this;
    }
}

(2) This class puts the work items into a queue and initiates processing:

public class Producer {
    ...
    public Producer() {
        super();
        workerQueue = new ArrayBlockingQueue<Future<WorkItem>>(THREADS_TO_USE);
        completionService = new ExecutorCompletionService<WorkItem>(Executors.newFixedThreadPool(THREADS_TO_USE));
        workerThread = new Thread(new Worker(workerQueue));
        workerThread.start();
    }

    public void send(Object o) throws Exception {
        WorkItem workItem = new WorkItem(o);
        Future<WorkItem> future = completionService.submit(workItem);
        workerQueue.put(future);
    }
}

(3) Once processing has finished, the work items are dequeued here:

public class Worker implements Runnable {
    private ArrayBlockingQueue<Future<WorkItem>> workerQueue = null;

    public Worker(ArrayBlockingQueue<Future<WorkItem>> workerQueue) {
        super();
        this.workerQueue = workerQueue;
    }

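    public void run() {
        try {
            while (true) {
                Future<WorkItem> fwi = workerQueue.take(); // dequeue the oldest submitted item
                fwi.get(); // wait until it has finished processing
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and let the thread exit
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause()); // a work item's call() failed
        }
    }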
}

(4) This is how you would use these classes in your code and submit new work:

public class MainApp {
    public static void main(String[] args) throws Exception {
        Producer p = new Producer();
        for (int i = 0; i < 10000; i++)
            p.send(i);
    }
}

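Answer 2: (score: 4)

Just give each object to be processed an ID, and create a proxy that accepts finished work and only allows it to be returned once the pushed IDs are sequential. Sample code is below. Note how simple it is: it uses an unsynchronized, automatically sorted collection and exposes just 2 simple methods as its API.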

public class SequentialPushingProxy {

    static class OrderedJob implements Comparable<OrderedJob>{
        static AtomicInteger idSource = new AtomicInteger();
        int id;

        public OrderedJob() {
            id = idSource.incrementAndGet();
        }

        public int getId() {
            return id;
        }

        @Override
        public int compareTo(OrderedJob o) {
            return Integer.compare(id, o.getId());
        }
    }

    int lastId = OrderedJob.idSource.get();

    public Queue<OrderedJob> queue;

    public SequentialPushingProxy() {
        queue = new PriorityQueue<OrderedJob>();
    }

    public synchronized void pushResult(OrderedJob job) {
        queue.add(job);
    }

    List<OrderedJob> jobsToReturn = new ArrayList<OrderedJob>();
    public synchronized List<OrderedJob> getFinishedJobs() {
        while (queue.peek() != null) {
            // only one consumer at a time, will be safe
            if (queue.peek().getId() == lastId+1) {
                jobsToReturn.add(queue.poll());
                lastId++;
            } else {
                break;
            }
        }
        if (jobsToReturn.size() != 0) {
            List<OrderedJob> toRet = jobsToReturn;
            jobsToReturn = new ArrayList<OrderedJob>();
            return toRet;
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        final SequentialPushingProxy proxy = new SequentialPushingProxy();

        int numProducerThreads = 5;

        for (int i=0; i<numProducerThreads; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while(true) {
                        proxy.pushResult(new OrderedJob());
                    }
                }
            }).start();
        }


        int numConsumerThreads = 1;

        for (int i=0; i<numConsumerThreads; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while(true) {
                        List<OrderedJob> ret = proxy.getFinishedJobs();
                        System.out.println("got "+ret.size()+" finished jobs");
                        try {
                            Thread.sleep(200);
                        } catch (InterruptedException e) {
                            // TODO Auto-generated catch block
                            e.printStackTrace();
                        }
                    }
                }
            }).start();
        }


        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        System.exit(0);
    }

}

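This code could easily be improved to:

  • allow pushing several job results at once, to reduce synchronization cost
  • introduce a limit on the returned collection, so that finished jobs are taken in smaller chunks
  • extract an interface for the 2 public methods and switch implementations to run tests

Answer 3: (score: 3)

Pump all the Futures through a BlockingQueue. Here is all the code you need: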

public class SequentialProcessor implements Consumer<Task> {
    private final ExecutorService executor = Executors.newCachedThreadPool();
    private final BlockingDeque<Future<Result>> queue = new LinkedBlockingDeque<>();

    public SequentialProcessor(Consumer<Result> listener) {
        new Thread(() -> {
            while (true) {
                try {
                    listener.accept(queue.take().get());
                } catch (InterruptedException | ExecutionException e) {
                    // handle the exception however you want, perhaps just logging it
                }
            }
        }).start();
    }

    public void accept(Task task) {
        queue.add(executor.submit(callableFromTask(task)));
    }

    private Callable<Result> callableFromTask(Task task) {
        return <how to create a Result from a Task>; // implement this however
    }
}

Then, to use it, create a SequentialProcessor (once):

SequentialProcessor processor = new SequentialProcessor(whatToDoWithResults);

and pump tasks to it:

Stream<Task> tasks; // given this

tasks.forEach(processor); // simply this

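I created the callableFromTask() method for illustration, but you can dispense with it if getting a Result from a Task is simple, by using a lambda or a method reference instead.

For example, if Task had a getResult() method, do this: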

queue.add(executor.submit(task::getResult));

or, if you need an expression (a lambda):

queue.add(executor.submit(() -> task.getValue() + "foo")); // or whatever

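Answer 4: (score: 3)

You could have 3 input queues and 3 output queues, one of each type for each worker thread.

Now, when you want to insert something into the input, you put it into only one of the 3 input queues, and you change the input queue in a round-robin fashion. The same applies to the output: when you want to take something, you pick the first output queue, and once you have your element you switch to the next queue.

All the queues need to be blocking.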
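
The round-robin scheme described above can be sketched roughly as follows. This is only an illustration under my own assumptions: the class name RoundRobinPipeline, its put()/take() methods and the capacity parameter are made up for the example, and it assumes a single producer thread and a single consumer thread. Item k always goes to input queue k mod n and its result is read from output queue k mod n, which is what restores the order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

class RoundRobinPipeline<I, O> {
    private final List<BlockingQueue<I>> inputs = new ArrayList<>();
    private final List<BlockingQueue<O>> outputs = new ArrayList<>();
    private int nextIn = 0;  // used only by the single producer thread
    private int nextOut = 0; // used only by the single consumer thread

    RoundRobinPipeline(int workers, int capacity, Function<I, O> work) {
        for (int w = 0; w < workers; w++) {
            BlockingQueue<I> in = new ArrayBlockingQueue<>(capacity);
            BlockingQueue<O> out = new ArrayBlockingQueue<>(capacity);
            inputs.add(in);
            outputs.add(out);
            // Each worker owns exactly one input and one output queue.
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        out.put(work.apply(in.take()));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true); // lets the JVM exit when main finishes
            t.start();
        }
    }

    // Hand the item to the next input queue in round-robin order.
    void put(I item) throws InterruptedException {
        inputs.get(nextIn).put(item);
        nextIn = (nextIn + 1) % inputs.size();
    }

    // Read from the output queues in the same round-robin order.
    O take() throws InterruptedException {
        O result = outputs.get(nextOut).take();
        nextOut = (nextOut + 1) % outputs.size();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        RoundRobinPipeline<Integer, Integer> p =
                new RoundRobinPipeline<>(3, 16, x -> x * 10);
        for (int i = 1; i <= 9; i++) p.put(i);
        // Results come back in submission order: 10, 20, ..., 90.
        for (int i = 1; i <= 9; i++) System.out.println(p.take());
    }
}
```

Because the queues are bounded, a caller that only calls put() and never take() will eventually block, so in real use the producer and the consumer would normally run on separate threads. One slow item also stalls the consumer until that item's queue delivers.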

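Answer 5: (score: 1)

Reactive programming could help. In my brief experience with RxJava, I found it more intuitive and easier to work with than using the core language features directly. Your mileage may vary. Here is a helpful starting point: https://www.youtube.com/watch?v=_t06LRX0DV0

The attached example also illustrates how this could be done. In the example below we have Packets that need to be processed. They pass through a simple transformation and are finally merged into one list. The output appended to this message shows that the Packets are received and transformed at different points in time, but in the end they are output in the order in which they were received.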

import static java.time.Instant.now;
import static rx.schedulers.Schedulers.io;

import java.time.Instant;
import java.util.List;
import java.util.Random;

import rx.Observable;
import rx.Subscriber;

public class RxApp {

  public static void main(String... args) throws InterruptedException {

    List<ProcessedPacket> processedPackets = Observable.range(0, 10) //
        .flatMap(i -> {
          return getPacket(i).subscribeOn(io());
        }) //
        .map(Packet::transform) //
        .toSortedList() //
        .toBlocking() //
        .single();

    System.out.println("===== RESULTS =====");
    processedPackets.stream().forEach(System.out::println);
  }

  static Observable<Packet> getPacket(Integer i) {
    return Observable.create((Subscriber<? super Packet> s) -> {
      // simulate latency
      try {
        Thread.sleep(new Random().nextInt(5000));
      } catch (Exception e) {
        e.printStackTrace();
      }
      System.out.println("packet requested for " + i);
      s.onNext(new Packet(i.toString(), now()));
      s.onCompleted();
    });
  }

}


class Packet {
  String aString;
  Instant createdOn;

  public Packet(String aString, Instant time) {
    this.aString = aString;
    this.createdOn = time;
  }

  public ProcessedPacket transform() {
    System.out.println("                          Packet being transformed " + aString);
    try {
      Thread.sleep(new Random().nextInt(5000));
    } catch (Exception e) {
      e.printStackTrace();
    }
    ProcessedPacket newPacket = new ProcessedPacket(this, now());
    return newPacket;
  }

  @Override
  public String toString() {
    return "Packet [aString=" + aString + ", createdOn=" + createdOn + "]";
  }
}


class ProcessedPacket implements Comparable<ProcessedPacket> {
  Packet p;
  Instant processedOn;

  public ProcessedPacket(Packet p, Instant now) {
    this.p = p;
    this.processedOn = now;
  }

  @Override
  public int compareTo(ProcessedPacket o) {
    return p.createdOn.compareTo(o.p.createdOn);
  }

  @Override
  public String toString() {
    return "ProcessedPacket [p=" + p + ", processedOn=" + processedOn + "]";
  }

}

Deconstruction

Observable.range(0, 10) //
    .flatMap(i -> {
      return getPacket(i).subscribeOn(io());
    }) // source the input as observables on multiple threads


    .map(Packet::transform) // processing the input data 

    .toSortedList() // sorting to sequence the processed inputs; 
    .toBlocking() //
    .single();

In one particular run, the Packets were received in the order 2,6,0,1,8,7,5,9,4,3 and were processed in the order 2,6,0,1,3,4,5,7,8,9 on different threads:

packet requested for 2
                          Packet being transformed 2
packet requested for 6
                          Packet being transformed 6
packet requested for 0
packet requested for 1
                          Packet being transformed 0
packet requested for 8
packet requested for 7
packet requested for 5
packet requested for 9
                          Packet being transformed 1
packet requested for 4
packet requested for 3
                          Packet being transformed 3
                          Packet being transformed 4
                          Packet being transformed 5
                          Packet being transformed 7
                          Packet being transformed 8
                          Packet being transformed 9
===== RESULTS =====
ProcessedPacket [p=Packet [aString=2, createdOn=2016-04-14T13:48:52.060Z], processedOn=2016-04-14T13:48:53.247Z]
ProcessedPacket [p=Packet [aString=6, createdOn=2016-04-14T13:48:52.130Z], processedOn=2016-04-14T13:48:54.208Z]
ProcessedPacket [p=Packet [aString=0, createdOn=2016-04-14T13:48:53.989Z], processedOn=2016-04-14T13:48:55.786Z]
ProcessedPacket [p=Packet [aString=1, createdOn=2016-04-14T13:48:54.109Z], processedOn=2016-04-14T13:48:57.877Z]
ProcessedPacket [p=Packet [aString=8, createdOn=2016-04-14T13:48:54.418Z], processedOn=2016-04-14T13:49:14.108Z]
ProcessedPacket [p=Packet [aString=7, createdOn=2016-04-14T13:48:54.600Z], processedOn=2016-04-14T13:49:11.338Z]
ProcessedPacket [p=Packet [aString=5, createdOn=2016-04-14T13:48:54.705Z], processedOn=2016-04-14T13:49:06.711Z]
ProcessedPacket [p=Packet [aString=9, createdOn=2016-04-14T13:48:55.227Z], processedOn=2016-04-14T13:49:16.927Z]
ProcessedPacket [p=Packet [aString=4, createdOn=2016-04-14T13:48:56.381Z], processedOn=2016-04-14T13:49:02.161Z]
ProcessedPacket [p=Packet [aString=3, createdOn=2016-04-14T13:48:56.566Z], processedOn=2016-04-14T13:49:00.557Z]

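Answer 6: (score: 0)

You could launch a DoTask thread for every WorkItem. This thread processes the work. When the work is done, you try to post the item, synchronized on a controlling object, inside which you check whether it is the right ID and wait if it is not.

The post implementation could be something like: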

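synchronized (controllingObject) {
    try {
        while (workItem.id != nextId) controllingObject.wait();
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the interrupt flag
    }
    // Post the workItem
    nextId++;
    controllingObject.notifyAll(); // wake the threads waiting for their turn
}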
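
For completeness, here is a small self-contained sketch of the same wait()/notifyAll() idea that can actually be run. The names OrderedPoster, post() and runDemo() are invented for this example, IDs are assumed to start at 0 with no gaps, and "posting" is simplified to appending the ID to a list. The threads are started in reverse ID order on purpose, so that most of them have to wait for their turn.

```java
import java.util.ArrayList;
import java.util.List;

class OrderedPoster {
    private final Object lock = new Object(); // plays the role of controllingObject
    private int nextId = 0;                   // ID that is allowed to post next
    private final List<Integer> posted = new ArrayList<>();

    // Blocks until it is this ID's turn, then posts it and wakes the others.
    void post(int id) throws InterruptedException {
        synchronized (lock) {
            while (id != nextId) {
                lock.wait();
            }
            posted.add(id);
            nextId++;
            lock.notifyAll();
        }
    }

    List<Integer> posted() {
        synchronized (lock) {
            return new ArrayList<>(posted);
        }
    }

    static List<Integer> runDemo(int items) throws InterruptedException {
        OrderedPoster poster = new OrderedPoster();
        List<Thread> threads = new ArrayList<>();
        // Start in reverse order to provoke out-of-order completion.
        for (int i = items - 1; i >= 0; i--) {
            final int id = i;
            Thread t = new Thread(() -> {
                try {
                    // The real processing of work item `id` would happen here.
                    poster.post(id);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) t.join();
        return poster.posted();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(8)); // prints [0, 1, 2, 3, 4, 5, 6, 7]
    }
}
```

Starting one thread per work item is likely to cost far more than 0.2 ms of work, so in practice the same post() logic would probably be called from a small fixed set of worker threads instead.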

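Answer 7: (score: 0)

Preprocessing: add an order value to each item, and prepare an array if one has not been allocated yet.

Input: a queue (items carry the order values 1, 2, 3, 4 and are taken concurrently; it does not matter which thread gets which item)

Output: an array (each result is written to the element at its index; use a synchronization point to wait for all threads at the end; no collision checking is needed because every thread writes to a different position)

Postprocessing: convert the array to a queue.

This needs an n-element array for n threads, or some multiple of n so that the postprocessing is done only once.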
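
A rough sketch of this array-based approach, under several assumptions of mine: the class name IndexedSlots is made up, the items are plain ints whose array index serves as the order value, the "processing" is just doubling, and an AtomicInteger counter stands in for the concurrent input queue. Thread.join() is used as the synchronization point, which also makes the workers' array writes visible to the caller.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

class IndexedSlots {
    static int[] processAll(int[] input, int threads) throws InterruptedException {
        int[] output = new int[input.length];
        AtomicInteger next = new AtomicInteger(); // index of the next unclaimed item
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                int i;
                // Claim indices until none are left; each index is claimed once,
                // so no two threads ever write to the same slot.
                while ((i = next.getAndIncrement()) < input.length) {
                    output[i] = input[i] * 2; // hypothetical processing step
                }
            });
            workers[t].start();
        }
        // Synchronization point: wait for all workers before reading the array.
        for (Thread w : workers) w.join();
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = processAll(new int[] {1, 2, 3, 4, 5}, 3);
        System.out.println(Arrays.toString(result)); // prints [2, 4, 6, 8, 10]
    }
}
```

This works on one batch at a time: the results only become available after the whole array has been filled, which matches the "postprocessing done only once" remark but does not give a continuous output stream.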