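I have a fairly typical producer-consumer scenario: one producer thread executes queries and puts the results on a BlockingQueue, and about 7-8 consumers take these objects off the BlockingQueue and run a long-running analysis on them. Once an analysis completes, the resulting object is placed in a HashMap with the original object as the key, i.e. HashMap<AnalyzedObject, AnalysisResult>.
Because of the nature of the relationships in the underlying data model, I get a lot of duplicate tasks, which obviously do not need to be reprocessed. My current solution is basically as follows: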
public class AnalysisAction implements Runnable {
    private Dataset data;
    private DbManager dbManager;
    private Path path;
    private Set<Integer> identifiedElements;
    private AnalysisResult res;
    private Map<Path, AnalysisResult> analyzedPaths;
    public static final AtomicInteger duplicates = new AtomicInteger(0);

    public AnalysisAction(Path p, Dataset ds, DbManager dbm, Map<Path, AnalysisResult> paths) {
        this.data = ds;
        this.path = p;
        this.dbManager = dbm;
        this.analyzedPaths = paths;
        this.res = new AnalysisResult(path);
    }

    @Override
    public void run() {
        if (!analyzedPaths.containsKey(path)) {
            long t0 = System.currentTimeMillis();
            // 1. Check the coverage of the path
            this.identifiedElements = getIdentifiedElements();
            if (identifiedElements.size() != 0) {
                try {
                    // TIME CONSUMING STUFF...
                    analyzedPaths.put(path, res);
                } catch (Exception e) {
                    // Exception handling...
                }
            }
            long t_end = System.currentTimeMillis();
            DebugToolbox.submitProcTime(t_end - t0);
        } else {
            duplicates.incrementAndGet();
            logger.finer("Duplicate path encountered..." + System.lineSeparator());
        }
    }

    // PRIVATE METHODS THAT CARRY OUT THE TIME CONSUMING STUFF...
}
Then, in the class that controls the multithreading, I have the following:
public class ConcurrencyService {
    private final ThreadPoolExecutor pool;
    private final int poolSize;
    private final int qCapacity = 1 << 7;
    private final long timeout = 3;
    private final Path tainedPath =
            new Path(Long.MIN_VALUE, "LAST_PATH_IN_QUEUE", "N/A", "N/A");
    private BlockingQueue<Path> bq;
    private DbManager dbMan;
    private Dataset ds;
    private Map<Path, AnalysisResult> analyzedPaths;
    private volatile boolean started;

    public ConcurrencyService(Dataset data, DbManager db) {
        this.ds = data;
        this.bq = new LinkedBlockingQueue<Path>(qCapacity);
        this.dbMan = db;
        this.analyzedPaths = new ConcurrentHashMap<Path, AnalysisResult>(1 << 15);
        this.started = false;
        poolSize = Runtime.getRuntime().availableProcessors();
        pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(poolSize, new FThreadFactory(-1));
    }

    public void serve() throws InterruptedException {
        try {
            ds.finalize();
            started = true;
            Thread producerThread = new Thread(new QueryingAction(), "f-query-thread");
            producerThread.start();
            Thread loggerThread = new Thread(new PeriodicLogAction(null), "f-logger-thread");
            loggerThread.start();
            while ((producerThread.getState() != Thread.State.TERMINATED) || !bq.isEmpty()) {
                Path p = bq.poll(timeout, TimeUnit.MINUTES);
                if (p != null) {
                    if (p.equals(tainedPath)) break;
                    pool.submit(new AnalysisAction(p, ds, dbMan, analyzedPaths));
                } else
                    logger.warning("Timed out while waiting for a path...");
            }
        } catch (Exception ex) {
            // Exception handling...
        } finally {
            pool.shutdown();
            long totalTasks = pool.getTaskCount(),
                    compTasks = pool.getCompletedTaskCount(),
                    tasksRemaining = totalTasks - compTasks,
                    timeout = 10 * tasksRemaining / poolSize;
            pool.awaitTermination(timeout, TimeUnit.SECONDS);
            logger.info(
                    "A total of " + DebugToolbox.getNbrProcTimes()
                    + " tasks analyzed. Mean process time is: "
                    + DebugToolbox.getMeanProcTimeAsString()
                    + " milliseconds." + System.lineSeparator());
        }
    }

    public boolean isDone() {
        if (this.started)
            return pool.isTerminated();
        else
            return false;
    }

    protected class QueryingAction implements Runnable {
        // Use this to limit the number of paths to be analyzed
        // private final int debugLimiter = 1500;
        private final int debugLimiter = Integer.MAX_VALUE;

        public void run() {
            try {
                int i = 0;
                outer: for (String el : ds.getElements()) {
                    inner: for (Path path : dbMan.getAllPathsWithElement(el)) {
                        if (i++ > debugLimiter)
                            break outer;
                        else
                            bq.put(path);
                    }
                }
                logger.info("Total number of queried paths: " + i);
            } catch (SQLException e) {
                // Exception handling...
            } catch (InterruptedException e) {
                // Exception handling...
            }
            bq.offer(tainedPath);
        }
    }

    protected class PeriodicLogAction implements Runnable {
        private final PrintStream ps;
        private final long period;
        private final static long DEF_PERIOD = 30000;
        private final String nL = System.getProperty("line.separator");
        private volatile boolean loop;
        private int counter = 0;
        private ConcurrencyService cs;
        private int inQueryQueue, inPoolQueue,
                completedTasks, inProccessedSet, duplicates;
        boolean sanityCheck;
        StringBuffer sb;

        PeriodicLogAction(PrintStream ps, long timePeriod) {
            this.ps = ps;
            this.period = timePeriod;
            this.loop = true;
            this.cs = ConcurrencyService.this;
        }

        // Alternative constructors

        @SuppressWarnings("rawtypes")
        public void run() {
            logger.config("PeriodicLogAction started on thread: " +
                    Thread.currentThread().getName() +
                    System.lineSeparator());
            while (loop) {
                // log # of paths created, analyzed and are in queue
                outputLogInfo();
                // wait designated time period
                try {
                    Thread.sleep(period);
                } catch (InterruptedException e) {}
                if (cs.isDone()) {
                    this.loop = false;
                    outputLogInfo();
                }
            }
        }

        private void outputLogInfo() {
            synchronized (pool) {
                Queue queryQueue = cs.bq,
                        poolQueue = cs.pool.getQueue();
                Map<Path, AnalysisResult> processedSet = cs.analyzedPaths;
                inQueryQueue = queryQueue.size();
                inPoolQueue = poolQueue.size();
                completedTasks = (int) pool.getCompletedTaskCount();
                inProccessedSet = processedSet.size();
                duplicates = AnalysisAction.duplicates.get();
                sanityCheck = (completedTasks == inProccessedSet + duplicates);
            }
            sb = new StringBuffer();
            sb.append("Checkpoint ").append(++counter).append(": ")
                    .append("QQ: ").append(inQueryQueue).append("\t")
                    .append("PQ: ").append(inPoolQueue).append("\t")
                    .append("CT: ").append(completedTasks).append("\t")
                    .append("AP: ").append(inProccessedSet).append("\t")
                    .append("DP: ").append(duplicates).append("\t")
                    .append("Sanity: ").append(sanityCheck);
            if (ps == null)
                logger.info(sb.toString() + nL);
            else
                ps.println(sb.toString());
        }
    }
}
This is what I see in the logs:
Sep 17, 2014 5:30:00 PM main.ConcurrencyService$QueryingAction run
INFO: Total number of queried paths: 81128
Sep 17, 2014 5:30:00 PM main.ConcurrencyService serve
INFO: All paths are queried and queued...
Initiating a timely shutdown of the pool..
...
Sep 17, 2014 5:49:49 PM main.ConcurrencyService serve
INFO: A total of 8620 tasks analyzed. Mean process time is: 1108.208 milliseconds.
...
Sep 17, 2014 5:49:54 PM main.ConcurrencyService$PeriodicLogAction outputLogInfo
INFO: Checkpoint 41: QQ: 0 PQ: 0 CT: 81128 AP: 8565 DP: 72508 Sanity: false
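...which tells me that:
- The number of completed tasks matches the number of objects queried and queued, so nothing was missed.
- The number of analyzed paths (and hence results) plus the number of duplicates does not add up to the number of completed tasks: 81128 - (8565 + 72508) = 55
- The number of accumulated results does not match the number of processing times reported by the AnalysisAction class: 8565 vs 8620 (i.e. 55 results are missing)
I am not sure what causes this discrepancy or where to start debugging. I obviously cannot go through 81128 tasks to find out which 55 are missing, and why...
Any suggestions?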
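EDIT: Here are some clarifications in response to questions in the comments:
DebugToolbox.submitProcTime(long t) is a synchronized static method that simply adds t to an ArrayList.
isDone() is a method in ConcurrencyService that I accidentally deleted while shortening the code I posted here. I have edited the code to show how the method is implemented.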
Answer 0 (score: 2)
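You check whether the map contains the key, then spend time generating the value, and only then put the value into the map.
While the value is being generated, another thread can start processing the same key. Since the key has not been added yet, you now have two threads generating the same value. As a result, the number of values generated is greater than the final size of the map.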
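The interleaving described above can be reproduced deterministically in a small, self-contained sketch. Everything here is hypothetical and not taken from the question's code: the class name CheckThenActRace, the key "path-1", and the CyclicBarrier, which is used only to force both threads to finish their containsKey() check before either one calls put().

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class CheckThenActRace {
    /** Runs two workers on the same key; returns {analysesPerformed, mapSize}. */
    static int[] runRace() {
        Map<String, String> analyzed = new ConcurrentHashMap<>();
        AtomicInteger analyses = new AtomicInteger();
        // Assumption for the demo: the barrier forces the unlucky interleaving
        // that happens only occasionally in the real application.
        CyclicBarrier bothChecked = new CyclicBarrier(2);

        Runnable worker = () -> {
            boolean absent = !analyzed.containsKey("path-1"); // check
            try {
                bothChecked.await(); // both threads have checked; neither has put yet
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
            if (absent) {
                analyses.incrementAndGet();       // stands in for the expensive analysis
                analyzed.put("path-1", "result"); // act
            }
        };

        Thread a = new Thread(worker);
        Thread b = new Thread(worker);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { analyses.get(), analyzed.size() };
    }

    public static void main(String[] args) {
        int[] r = runRace();
        System.out.println("analyses=" + r[0] + ", map size=" + r[1]);
    }
}
```

Both workers see the key as absent, so two analyses are performed while the map ends up with a single entry. This is the same kind of mismatch as the 8620 processing times versus 8565 results in the question, and neither task is counted as a duplicate.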
The solution is to add the result (possibly a placeholder) up front, using putIfAbsent() to check for the key's presence atomically.
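A minimal sketch of that idea, under stated assumptions: the class name ClaimThenAnalyze, the PENDING placeholder, the integer keys, and the string results are all hypothetical stand-ins for Path and AnalysisResult. The only real API relied on is ConcurrentMap.putIfAbsent(), which returns null exactly when the calling thread inserted the entry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ClaimThenAnalyze {
    // Hypothetical placeholder stored while the real result is being computed.
    static final String PENDING = "PENDING";

    /** Submits `tasks` jobs over `distinctKeys` keys; returns {analyses, duplicates, mapSize}. */
    static int[] run(int tasks, int distinctKeys) {
        ConcurrentMap<Integer, String> analyzed = new ConcurrentHashMap<>();
        AtomicInteger analyses = new AtomicInteger();
        AtomicInteger duplicates = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(8);

        for (int i = 0; i < tasks; i++) {
            final int key = i % distinctKeys;
            pool.submit(() -> {
                // Atomic check-and-claim: only one thread per key gets null back.
                if (analyzed.putIfAbsent(key, PENDING) == null) {
                    analyses.incrementAndGet();         // expensive analysis would go here
                    analyzed.put(key, "result-" + key); // replace the placeholder
                } else {
                    duplicates.incrementAndGet();
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { analyses.get(), duplicates.get(), analyzed.size() };
    }

    public static void main(String[] args) {
        int[] r = run(10_000, 100);
        System.out.println("analyses=" + r[0] + ", duplicates=" + r[1] + ", map size=" + r[2]);
    }
}
```

With this pattern the number of analyses always equals the map size, and analyses plus duplicates equals the number of submitted tasks. In the question's code the AnalysisResult is already constructed in the constructor, so it could presumably serve as the placeholder itself via analyzedPaths.putIfAbsent(path, res). If the analysis fails or is skipped, the placeholder would have to be removed or marked accordingly, otherwise the sanity check will still be off by those cases.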