Synchronization issue while using ExecutorCompletionService

Time: 2014-02-02 14:53:40

Tags: java multithreading

I have the following scenario:

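  1. Text files are generated dynamically every day, anywhere from 0 to 8 per day. Each file can range from small to large, depending on that day's data.
  2. Some checks (business checks, rules) need to be run on them.
  3. I implemented it as follows, but it does not behave as expected; it seems I am doing something wrong.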

    To store the results I have the following class; each file gets one Result object.

    public class Result {
    
        private String fileName;
        private Map<RuleTypes, String> allResult = new HashMap<RuleTypes, String>();
    
            // setters, getters (getFileResult() returns allResult), constructor: plain POJO
    }
    

    A rule looks like this:

    public class ValidateRule1 implements Rule {
    
        private String fileName;
    
        public String getFileName() {
            return fileName;
        }
    
        public void setFileName(String fileName) {
            this.fileName = fileName;
        }
    
        @Override
        public void init() {
            // TODO Auto-generated method stub
    
        }
    
        @Override
        public void runRule() {
            System.out.println("Start running ... Rule 1  for "+fileName);
            try {
                Random r = new Random();
                int sleepRandomTime = r.nextInt(15-1) + 1;
                Thread.sleep(sleepRandomTime) ; // simulate rule execution
            } catch (InterruptedException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            System.out.println("End running ... Rule 1 for "+fileName);
    
        }
    
        @Override
        public RuleTypes getRuleName() {
            return RuleTypes.Rule1;
        }
    
    }
    

    The rule factory looks like this:

    public static Rule getRule(RuleTypes ruleName) {
            Rule result=null;
    
            switch(ruleName) {
    
                case Rule1 :
                    result = new ValidateRule1(); // todo singleton
                    break;
    
                case Rule2 :
                    result = new ValidateRule2(); // todo singleton
                    break;
    
                case Rule3 :
                    result = new ValidateRule3(); // todo singleton
                    break;
                ...
                }
                }
    

    I invoke the rules as follows. I use RuleFactory to create the rules (one singleton object per rule):

    final ConcurrentLinkedQueue<Rule> rulesToExecuteForModel = new ConcurrentLinkedQueue<Rule>();
    rulesToExecuteForModel.add(RuleFactory.getRule(RuleTypes.Rule1));
            rulesToExecuteForModel.add(RuleFactory.getRule(RuleTypes.Rule2));
            rulesToExecuteForModel.add(RuleFactory.getRule(RuleTypes.Rule3));
            rulesToExecuteForModel.add(RuleFactory.getRule(RuleTypes.Rule4));
            rulesToExecuteForModel.add(RuleFactory.getRule(RuleTypes.Rule5));
            rulesToExecuteForModel.add(RuleFactory.getRule(RuleTypes.Rule6));
            rulesToExecuteForModel.add(RuleFactory.getRule(RuleTypes.Rule7));
            rulesToExecuteForModel.add(RuleFactory.getRule(RuleTypes.Rule8));
    
    
            // pick 1 file and run all rules for it , different threads can pick up different files concurrently ... dont think will need synchronization here 
            List<File> fileQueue = new LinkedList<File>();
            fileQueue.add(new File("../test/files/File1.20140203"));
            fileQueue.add(new File("../test/files/File2.20140203"));
            fileQueue.add(new File("../test/files/File3.20140203"));
            fileQueue.add(new File("../test/files/File4.20140203"));
            fileQueue.add(new File("../test/files/File5.20140203"));
            fileQueue.add(new File("../test/files/File6.20140203"));
    
            // Results Display ... 1 Result obj for 1 File
            ConcurrentLinkedQueue<Result> fileWiseResult = new ConcurrentLinkedQueue<Result>();
            int maxNumOfFiles = fileQueue.size();  
    
            // TODO : how can i exploit the fact that this program runs on 8 core machine ? does 1 thread correspond to 1 CPU ? i kept 8 here because it will run on 8 core machine
            final ExecutorService pool = Executors.newFixedThreadPool(8);
            final ExecutorCompletionService<Result> completionService = new ExecutorCompletionService<Result>(pool);
    
            for (final File file : fileQueue) {
                System.out.println("picked file "+file.getName()+" running ALL rules for it");
                final Future<Result> contentFuture = completionService.submit(new Callable<Result>() {
                    @Override
                    public Result call() throws Exception {
                        Result r = new Result(); // 1 file 1 Result object
                        r.setFileName(file.getName());
                        Iterator<Rule> itr=rulesToExecuteForModel.iterator();
                        // sequentially run different rules for same file
                        while (itr.hasNext()) {
                            Rule currentRule  = itr.next();
                            currentRule.setFileName(file.getName());
                            currentRule.runRule();
                            // take fileName / File as parameter , String result for  currentFile and currentRule
                            r.getFileResult().put(currentRule.getRuleName(), "result for "+currentRule.getRuleName().toString());
                        }
                        return r; 
                    }
                });
            }
    
            for(int i = 0; i <maxNumOfFiles; ++i) {
            Future<Result> future;
            try {
                future = completionService.take();
                Result currentResult=null;
                try {
                    currentResult = future.get();
                } catch (ExecutionException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
                System.out.println("Result for file ["+currentResult.getFileName()+"] is ["+currentResult.getFileResult()+"]");
                fileWiseResult.add(currentResult);
            } catch (InterruptedException e1) {
                e1.printStackTrace();
            }
        }
    

    The output looks like this:

    picked file File1.20140203 running rules for it
    Start running ... Rule 1  for File1.20140203
    End running ... Rule 1 for File1.20140203
    
    E
    Start running ... Rule 2 for File1.20140203
    End running ... Rule 2 for File1.20140203
    End running ... Rule 2 for File1.20140203
    
    Start running ... Rule 3 for File1.20140203
    End running ... Rule 3 for File1.20140203
    
    Start running ... Rule 4 for File1.20140203
    End running ... Rule 4 for File1.20140203
    End running ... Rule 4 for File1.20140203
    
    Start running ... Rule 5 for File1.20140203
    End running ... Rule 5 for File1.20140203
    End running ... Rule 5 for File1.20140203
    End running ... Rule 5 for File1.20140203
    End running ... Rule 5 for File1.20140203
    
    Start running ... Rule 6 for File1.20140203
    End running ... Rule 6 for File1.20140203
    End running ... Rule 6 for File1.20140203
    
    Start running ... Rule 7 for File1.20140203
    End running ... Rule 7 for File1.20140203
    End running ... Rule 7 for File1.20140203
    
    Start running ... Rule 8 for File1.20140203
    End running ... Rule 8 for File1.20140203
    End running ... Rule 8 for File1.20140203
    Result for file [File1.20140203] is [{Rule2=result for Rule2, Rule5=result for Rule5, Rule1=result for Rule1, Rule6=result for Rule6, Rule4=result for Rule4, Rule7=result for Rule7, Rule3=result for Rule3, Rule8=result for Rule8}]
    

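    I expect exactly one statement like "Start running ... Rule 2 for File1.20140203" and exactly ONE like "End running ... Rule 2 for File1.20140203".

    But as seen in the output, the number of "End" lines > the number of "Start" lines.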

    I also observe:

    Start running ... Rule1 for File5.20140203
    Start running ... Rule1 for File6.20140203
    Start running ... Rule1 for File6.20140203
    Start running ... Rule1 for File4.20140203
    Start running ... Rule1 for File5.20140203
    Start running ... Rule1 for File4.20140203
    

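    I expected 6 unique file names in the log messages above.

    First question: what am I doing wrong, and how can I correct it?

    Second question (an optimization, not the actual problem): this program will run on an 8-core machine. If I keep the pool size at 8, does that mean 8 threads will run in parallel, one per core? Is there a way to ensure this?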

1 Answer:

Answer 0 (score: 1)

  

But as seen in the output, the number of "End" lines > the number of "Start" lines.

Your error is on this line:

  

currentRule.setFileName(file.getName());

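Multiple threads are using the same collection of rules, so every task calls setFileName() on the same rule objects. The rules should therefore not hold any persistent state. Instead, pass the file name in with each rule method call.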
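To make the failure mode concrete, here is a minimal sketch (not part of the original answer) that forces the problematic interleaving with latches rather than waiting for it to happen by chance. `StatefulRule` and `SharedStateDemo` are hypothetical names; the rule class only mimics the `fileName` field from the question, and the lambdas assume Java 8 or later.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a rule that stores the file name in a field.
class StatefulRule {
    private String fileName;
    void setFileName(String fileName) { this.fileName = fileName; }
    String getFileName() { return fileName; }
}

class SharedStateDemo {
    // Returns the file name the first worker reads back from the shared rule
    // after a second worker has overwritten the field in between.
    static String observedByFirstWorker() {
        final StatefulRule shared = new StatefulRule();
        final CountDownLatch firstHasSet = new CountDownLatch(1);
        final CountDownLatch secondHasSet = new CountDownLatch(1);
        final String[] seen = new String[1];

        Thread first = new Thread(() -> {
            shared.setFileName("File1.20140203");
            firstHasSet.countDown();
            try {
                secondHasSet.await(); // the other worker runs here
                seen[0] = shared.getFileName();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread second = new Thread(() -> {
            try {
                firstHasSet.await();
                shared.setFileName("File2.20140203"); // clobbers the shared field
                secondHasSet.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("File1 worker sees: " + observedByFirstWorker());
    }
}
```

Running `main` prints `File1 worker sees: File2.20140203`. This is the same effect as in the question's log, where a task prints a file name that another task wrote into the shared rule.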

You should change the runRule() method to take a fileName parameter instead of keeping the file name as a field of the rule class.
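The following is one possible sketch of that change, not the asker's actual code. The names `StatelessRule`, `NamedRule`, and `StatelessRuleDemo` are hypothetical, as is the choice to return a message string from `runRule(String)`. The pool size of 4 is arbitrary, and the lambdas assume Java 8 or later.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stateless variant of Rule: the file name is an argument, not a field.
interface StatelessRule {
    String getRuleName();
    String runRule(String fileName);
}

class NamedRule implements StatelessRule {
    private final String name; // immutable, so sharing one instance across threads is safe

    NamedRule(String name) { this.name = name; }

    @Override public String getRuleName() { return name; }

    @Override public String runRule(String fileName) {
        // fileName is a parameter local to this call; other threads cannot overwrite it.
        return name + " checked " + fileName;
    }
}

class StatelessRuleDemo {
    // Runs every rule for every file, one task per file, and returns messages per file.
    static Map<String, List<String>> runAll(List<String> fileNames, List<StatelessRule> rules) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        ExecutorCompletionService<Map.Entry<String, List<String>>> completionService =
                new ExecutorCompletionService<>(pool);
        try {
            for (String fileName : fileNames) {
                completionService.submit(() -> {
                    List<String> messages = new ArrayList<>();
                    for (StatelessRule rule : rules) {
                        messages.add(rule.runRule(fileName));
                    }
                    Map.Entry<String, List<String>> entry = new SimpleEntry<>(fileName, messages);
                    return entry;
                });
            }
            Map<String, List<String>> results = new TreeMap<>();
            for (int i = 0; i < fileNames.size(); i++) {
                // take() blocks until the next task finishes, in completion order.
                Map.Entry<String, List<String>> done = completionService.take().get();
                results.put(done.getKey(), done.getValue());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<StatelessRule> rules =
                Arrays.<StatelessRule>asList(new NamedRule("Rule1"), new NamedRule("Rule2"));
        System.out.println(runAll(Arrays.asList("File1", "File2", "File3"), rules));
    }
}
```

Because each rule keeps only an immutable name, the same rule instances can be shared by all tasks, and every message is built from the file name that was passed to that particular call.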

  

This program will run on an 8-core machine. If I keep the pool size at 8, does that mean 8 threads will run in parallel, one per core? Is there a way to ensure this?

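They should be, but there is no way to guarantee it. Other processes running on the OS need CPU time too. Whether the threads actually run in parallel also depends on how much IO and other blocking work your application does. The right approach is to vary the number of threads in the pool until you get the best application speed.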
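One rough way to try different pool sizes is sketched below; it is an illustration under stated assumptions, not a benchmark methodology from the original answer. `PoolSizeTrial` is a hypothetical class, the 20 ms sleep merely stands in for real rule work, and the candidate sizes are arbitrary. `Runtime.getRuntime().availableProcessors()` reports how many processors the JVM can use, which is a reasonable starting point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSizeTrial {
    // Runs taskCount simulated tasks on a pool of the given size; returns elapsed milliseconds.
    static long timeWithPoolSize(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            tasks.add(() -> {
                Thread.sleep(20); // stand-in for blocking work such as file IO
                return id;
            });
        }
        long start = System.nanoTime();
        try {
            // invokeAll blocks until every task has completed.
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                future.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        for (int size : new int[] {1, cores, cores * 2}) {
            System.out.println("pool size " + size + ": "
                    + timeWithPoolSize(size, 16) + " ms");
        }
    }
}
```

With real rules the numbers depend on how much of the work is CPU-bound versus blocked on IO, so measurements should be taken with representative files rather than a simulated sleep.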