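I have a series of units of work, let's call them "work items", which are processed sequentially (for now). I would like to speed up processing by doing the work multithreaded.
Constraint: the work items arrive in a specific order. The order is irrelevant during processing, but once processing is finished the order must be restored.
Something like this: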
|.|
|.|
|4|
|3|
|2| <- incoming queue
|1|
/ | \
2 1 3 <- worker threads
\ | /
|3|
|2| <- outgoing queue
|1|
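I would like to solve this in Java, preferably without Executor Services, Futures, etc., but with basic concurrency primitives such as wait(), notify(), etc.
The reason: my work items are very small and fine grained, and each one finishes processing in about 0.2 milliseconds. So I fear that using things from java.util.concurrent.* might introduce a lot of overhead and slow my code down.
The examples I have found so far all preserve the order during processing (which is irrelevant in my case) and do not care about the order after processing (which is crucial in my case).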
Answer 0 (score: 4)
If a BlockingQueue is allowed, why ignore the rest of the concurrency utilities in Java? You could use, for example, a Stream (if you have Java 1.8) for the above:
List<Type> data = ...;
List<Other> out = data.parallelStream()
.map(t -> doSomeWork(t))
.collect(Collectors.toList());
Since you start from an ordered Collection (a List) and also collect into a List, your results will be in the same order as the input.
Answer 1 (score: 4)
This is how I solved your problem in a previous project (but using java.util.concurrent):
(1) The WorkItem class does the actual work/processing:
public class WorkItem implements Callable<WorkItem> {
Object content;
public WorkItem(Object content) {
super();
this.content = content;
}
public WorkItem call() throws Exception {
// getContent() + do your processing
return this;
}
}
(2) This class puts the work items into a queue and starts the processing:
public class Producer {
...
public Producer() {
super();
workerQueue = new ArrayBlockingQueue<Future<WorkItem>>(THREADS_TO_USE);
completionService = new ExecutorCompletionService<WorkItem>(Executors.newFixedThreadPool(THREADS_TO_USE));
workerThread = new Thread(new Worker(workerQueue));
workerThread.start();
}
public void send(Object o) throws Exception {
WorkItem workItem = new WorkItem(o);
Future<WorkItem> future = completionService.submit(workItem);
workerQueue.put(future);
}
}
(3) Once processing has finished, the work items are dequeued here:
public class Worker implements Runnable {
private ArrayBlockingQueue<Future<WorkItem>> workerQueue = null;
public Worker(ArrayBlockingQueue<Future<WorkItem>> workerQueue) {
super();
this.workerQueue = workerQueue;
}
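public void run() {
try {
while (true) {
Future<WorkItem> fwi = workerQueue.take(); // dequeue the oldest submitted item
fwi.get(); // wait until it has finished processing
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // restore the flag and stop the worker
} catch (ExecutionException e) {
throw new RuntimeException(e.getCause()); // a work item failed
}
}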
}
(4) This is how you use these things in your code and submit new work:
public class MainApp {
public static void main(String[] args) throws Exception {
Producer p = new Producer();
for (int i = 0; i < 10000; i++)
p.send(i);
}
}
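Answer 2 (score: 4)
Just give each object to be processed an ID, and create a proxy that accepts finished work and only allows it to be returned once the pushed IDs are sequential. Sample code below. Note how simple it is: it uses an unsynchronized auto-sorting collection and exposes just 2 simple methods as its API.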
public class SequentialPushingProxy {
static class OrderedJob implements Comparable<OrderedJob>{
static AtomicInteger idSource = new AtomicInteger();
int id;
public OrderedJob() {
id = idSource.incrementAndGet();
}
public int getId() {
return id;
}
@Override
public int compareTo(OrderedJob o) {
return Integer.compare(id, o.getId());
}
}
int lastId = OrderedJob.idSource.get();
public Queue<OrderedJob> queue;
public SequentialPushingProxy() {
queue = new PriorityQueue<OrderedJob>();
}
public synchronized void pushResult(OrderedJob job) {
queue.add(job);
}
List<OrderedJob> jobsToReturn = new ArrayList<OrderedJob>();
public synchronized List<OrderedJob> getFinishedJobs() {
while (queue.peek() != null) {
// only one consumer at a time, will be safe
if (queue.peek().getId() == lastId+1) {
jobsToReturn.add(queue.poll());
lastId++;
} else {
break;
}
}
if (jobsToReturn.size() != 0) {
List<OrderedJob> toRet = jobsToReturn;
jobsToReturn = new ArrayList<OrderedJob>();
return toRet;
}
return Collections.emptyList();
}
public static void main(String[] args) {
final SequentialPushingProxy proxy = new SequentialPushingProxy();
int numProducerThreads = 5;
for (int i=0; i<numProducerThreads; i++) {
new Thread(new Runnable() {
@Override
public void run() {
while(true) {
proxy.pushResult(new OrderedJob());
}
}
}).start();
}
int numConsumerThreads = 1;
for (int i=0; i<numConsumerThreads; i++) {
new Thread(new Runnable() {
@Override
public void run() {
while(true) {
List<OrderedJob> ret = proxy.getFinishedJobs();
System.out.println("got "+ret.size()+" finished jobs");
try {
Thread.sleep(200);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}).start();
}
try {
Thread.sleep(5000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
System.exit(0);
}
}
This code can easily be improved.
Answer 3 (score: 3)
Pump all the Futures through a BlockingQueue. Here is all the code you need:
public class SequentialProcessor implements Consumer<Task> {
private final ExecutorService executor = Executors.newCachedThreadPool();
private final BlockingDeque<Future<Result>> queue = new LinkedBlockingDeque<>();
public SequentialProcessor(Consumer<Result> listener) {
new Thread(() -> {
while (true) {
try {
listener.accept(queue.take().get());
} catch (InterruptedException | ExecutionException e) {
// handle the exception however you want, perhaps just logging it
}
}
}).start();
}
public void accept(Task task) {
queue.add(executor.submit(callableFromTask(task)));
}
private Callable<Result> callableFromTask(Task task) {
return <how to create a Result from a Task>; // implement this however
}
}
Then, to use it, create a SequentialProcessor (once):
SequentialProcessor processor = new SequentialProcessor(whatToDoWithResults);
and pump tasks to it:
Stream<Task> tasks; // given this
tasks.forEach(processor); // simply this
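I created the callableFromTask() method for illustration, but if getting a Result from a Task is straightforward you can dispense with it and use a lambda or a method reference instead.
For example, if Task has a getResult() method, do this: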
queue.add(executor.submit(task::getResult));
or, if you need an expression (lambda):
queue.add(executor.submit(() -> task.getValue() + "foo")); // or whatever
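Answer 4 (score: 3)
You could have 3 input queues and 3 output queues - one of each type for each worker thread.
Now, when you want to insert something into the input, you put it into just one of the 3 input queues, and you switch input queues in a round-robin fashion. The same applies to the output: when you want to take something from the output, you take it from the first output queue, and once you have your element you switch to the next queue.
All the queues need to be blocking.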
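The round-robin scheme described above could be sketched roughly as follows. This is only an illustration under assumptions: the class name RoundRobinPipeline, the capacity parameter and the squaring function are invented here, and a single producer thread and a single consumer thread are assumed, so the two round-robin counters need no locking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

class RoundRobinPipeline<I, O> {
    private final List<BlockingQueue<I>> inputs = new ArrayList<>();
    private final List<BlockingQueue<O>> outputs = new ArrayList<>();
    private int nextIn = 0;   // touched only by the single producer thread
    private int nextOut = 0;  // touched only by the single consumer thread

    RoundRobinPipeline(int workers, int capacity, Function<I, O> work) {
        for (int w = 0; w < workers; w++) {
            BlockingQueue<I> in = new ArrayBlockingQueue<>(capacity);
            BlockingQueue<O> out = new ArrayBlockingQueue<>(capacity);
            inputs.add(in);
            outputs.add(out);
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        out.put(work.apply(in.take())); // each worker is FIFO on its own pair
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true); // let the JVM exit when main returns
            t.start();
        }
    }

    // Item k goes to input queue k % workers.
    void put(I item) throws InterruptedException {
        inputs.get(nextIn).put(item);
        nextIn = (nextIn + 1) % inputs.size();
    }

    // Result k is read from output queue k % workers, which restores the order.
    O take() throws InterruptedException {
        O result = outputs.get(nextOut).take();
        nextOut = (nextOut + 1) % outputs.size();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        RoundRobinPipeline<Integer, Integer> p =
                new RoundRobinPipeline<Integer, Integer>(3, 16, x -> x * x);
        for (int i = 1; i <= 9; i++) p.put(i);
        for (int i = 1; i <= 9; i++) System.out.println(p.take());
    }
}
```

Because the queues are bounded, a producer that submits many items before anything is taken will eventually block in put(), so for larger volumes the producer and the consumer should run on separate threads.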
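Answer 5 (score: 1)
Reactive programming may help. In my brief experience with RxJava, I found it more intuitive and easier to work with than the core language features. Your mileage may vary. Here is a useful starting point: https://www.youtube.com/watch?v=_t06LRX0DV0
The attached example also shows how this could be done. In the example below we have Packets that need to be processed. They are taken through a simple transformation and finally merged into one list. The output appended to this message shows that the Packets are received and transformed at different points in time, but in the end they are output in the order in which they were received.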
import static java.time.Instant.now;
import static rx.schedulers.Schedulers.io;
import java.time.Instant;
import java.util.List;
import java.util.Random;
import rx.Observable;
import rx.Subscriber;
public class RxApp {
public static void main(String... args) throws InterruptedException {
List<ProcessedPacket> processedPackets = Observable.range(0, 10) //
.flatMap(i -> {
return getPacket(i).subscribeOn(io());
}) //
.map(Packet::transform) //
.toSortedList() //
.toBlocking() //
.single();
System.out.println("===== RESULTS =====");
processedPackets.stream().forEach(System.out::println);
}
static Observable<Packet> getPacket(Integer i) {
return Observable.create((Subscriber<? super Packet> s) -> {
// simulate latency
try {
Thread.sleep(new Random().nextInt(5000));
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("packet requested for " + i);
s.onNext(new Packet(i.toString(), now()));
s.onCompleted();
});
}
}
class Packet {
String aString;
Instant createdOn;
public Packet(String aString, Instant time) {
this.aString = aString;
this.createdOn = time;
}
public ProcessedPacket transform() {
System.out.println(" Packet being transformed " + aString);
try {
Thread.sleep(new Random().nextInt(5000));
} catch (Exception e) {
e.printStackTrace();
}
ProcessedPacket newPacket = new ProcessedPacket(this, now());
return newPacket;
}
@Override
public String toString() {
return "Packet [aString=" + aString + ", createdOn=" + createdOn + "]";
}
}
class ProcessedPacket implements Comparable<ProcessedPacket> {
Packet p;
Instant processedOn;
public ProcessedPacket(Packet p, Instant now) {
this.p = p;
this.processedOn = now;
}
@Override
public int compareTo(ProcessedPacket o) {
return p.createdOn.compareTo(o.p.createdOn);
}
@Override
public String toString() {
return "ProcessedPacket [p=" + p + ", processedOn=" + processedOn + "]";
}
}
Deconstruction
Observable.range(0, 10) //
.flatMap(i -> {
return getPacket(i).subscribeOn(io());
}) // source the input as observables on multiple threads
.map(Packet::transform) // processing the input data
.toSortedList() // sorting to sequence the processed inputs;
.toBlocking() //
.single();
In one particular run, the packets were received in the order 2, 6, 0, 1, 8, 7, 5, 9, 4, 3 and processed in the order 2, 6, 0, 1, 3, 4, 5, 7, 8, 9 on different threads:
packet requested for 2
Packet being transformed 2
packet requested for 6
Packet being transformed 6
packet requested for 0
packet requested for 1
Packet being transformed 0
packet requested for 8
packet requested for 7
packet requested for 5
packet requested for 9
Packet being transformed 1
packet requested for 4
packet requested for 3
Packet being transformed 3
Packet being transformed 4
Packet being transformed 5
Packet being transformed 7
Packet being transformed 8
Packet being transformed 9
===== RESULTS =====
ProcessedPacket [p=Packet [aString=2, createdOn=2016-04-14T13:48:52.060Z], processedOn=2016-04-14T13:48:53.247Z]
ProcessedPacket [p=Packet [aString=6, createdOn=2016-04-14T13:48:52.130Z], processedOn=2016-04-14T13:48:54.208Z]
ProcessedPacket [p=Packet [aString=0, createdOn=2016-04-14T13:48:53.989Z], processedOn=2016-04-14T13:48:55.786Z]
ProcessedPacket [p=Packet [aString=1, createdOn=2016-04-14T13:48:54.109Z], processedOn=2016-04-14T13:48:57.877Z]
ProcessedPacket [p=Packet [aString=8, createdOn=2016-04-14T13:48:54.418Z], processedOn=2016-04-14T13:49:14.108Z]
ProcessedPacket [p=Packet [aString=7, createdOn=2016-04-14T13:48:54.600Z], processedOn=2016-04-14T13:49:11.338Z]
ProcessedPacket [p=Packet [aString=5, createdOn=2016-04-14T13:48:54.705Z], processedOn=2016-04-14T13:49:06.711Z]
ProcessedPacket [p=Packet [aString=9, createdOn=2016-04-14T13:48:55.227Z], processedOn=2016-04-14T13:49:16.927Z]
ProcessedPacket [p=Packet [aString=4, createdOn=2016-04-14T13:48:56.381Z], processedOn=2016-04-14T13:49:02.161Z]
ProcessedPacket [p=Packet [aString=3, createdOn=2016-04-14T13:48:56.566Z], processedOn=2016-04-14T13:49:00.557Z]
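Answer 6 (score: 0)
You could start a DoTask thread for each WorkItem. This thread processes the work. When the work is done, it tries to post the item while synchronized on a controlling object: inside that block it checks whether the item has the right ID, and waits if it does not.
The post implementation could be something like: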
synchronized(controllingObject) {
try {
while(workItem.id != nextId) controllingObject.wait();
} catch (Exception e) {}
//Post the workItem
nextId++;
controllingObject.notifyAll(); // notify on the same monitor that the waiters use
}
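A runnable version of this idea might look like the sketch below. It uses only wait()/notifyAll() and plain threads. The class OrderedPublisher, the list that stands in for the outgoing queue, and the runDemo helper are assumptions made for illustration. IDs are assumed to start at 0 and to be contiguous.

```java
import java.util.ArrayList;
import java.util.List;

class OrderedPublisher {
    private final Object lock = new Object();
    private final List<Integer> posted = new ArrayList<>(); // stands in for the outgoing queue
    private int nextId = 0;

    // Blocks until it is this id's turn, then posts and wakes the other waiters.
    void post(int id) throws InterruptedException {
        synchronized (lock) {
            while (id != nextId) {
                lock.wait();
            }
            posted.add(id);
            nextId++;
            lock.notifyAll();
        }
    }

    List<Integer> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(posted);
        }
    }

    // Starts one thread per item, in reverse order so that most threads have to wait.
    static List<Integer> runDemo(int items) throws InterruptedException {
        OrderedPublisher publisher = new OrderedPublisher();
        List<Thread> threads = new ArrayList<>();
        for (int i = items - 1; i >= 0; i--) {
            final int id = i;
            Thread t = new Thread(() -> {
                try {
                    // ... the real processing of work item `id` would happen here ...
                    publisher.post(id);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // give up without posting out of order
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) t.join();
        return publisher.snapshot();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(8)); // prints [0, 1, 2, 3, 4, 5, 6, 7]
    }
}
```

One thread per item is likely too expensive for items that take about 0.2 ms each, so in practice the post() call would more plausibly be made from a small fixed set of worker threads.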
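Answer 7 (score: 0)
Preprocessing: add an order value to each item if one is not already assigned, and prepare an array.
Input: a queue (polled concurrently, with order values 1, 2, 3, 4; it does not matter which thread gets which item)
Output: an array (each result is written to the element at its index; a sync point waits for all threads at the end; no collision checks are needed because every write goes to a different position)
Post-processing: convert the array to a queue.
You need an n-element array for n threads, or some multiple of n so that the post-processing is done only once.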
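The four steps above could be sketched as follows. The names IndexedSlots and Item and the doubling "work" are hypothetical. The sketch also assumes that ConcurrentLinkedQueue is acceptable as the shared input queue and uses Thread.join() as the sync point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

class IndexedSlots {
    static final class Item {
        final int order;   // position assigned during preprocessing
        final int payload;
        Item(int order, int payload) { this.order = order; this.payload = payload; }
    }

    static int[] processAll(int[] payloads, int threads) throws InterruptedException {
        // Preprocessing: tag each item with its position and prepare the output array.
        ConcurrentLinkedQueue<Item> input = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < payloads.length; i++) input.add(new Item(i, payloads[i]));
        int[] output = new int[payloads.length];

        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                Item item;
                // Any thread may take any item; the slot index restores the order.
                while ((item = input.poll()) != null) {
                    output[item.order] = item.payload * 2; // hypothetical work
                }
            });
            workers.add(worker);
            worker.start();
        }
        // Sync point: join() also makes the workers' array writes visible here.
        for (Thread worker : workers) worker.join();
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] out = processAll(new int[] {5, 6, 7, 8}, 3);
        System.out.println(Arrays.toString(out)); // prints [10, 12, 14, 16]
    }
}
```

The returned array is already in input order, so the post-processing step reduces to copying it into whatever queue type the downstream code expects.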