Optaplanner configuration with moveThreadCount = 1 behaves differently from one with no moveThreadCount

Time: 2018-10-18 17:13:39

Tags: multithreading optaplanner

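I have upgraded to Optaplanner 7.12 and, while hunting for a potential problem with multithreading mixed with VariableListeners, found an oddity in reproducible execution: if the config file contains <moveThreadCount>1</moveThreadCount>, execution is not the same as when the moveThreadCount line is absent. As a user I find this unexpected, and it may be intertwined with a potential Optaplanner race condition that I am seeing (noted at the end of this post).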

Code details

I observe this in REPRODUCIBLE mode with a fixed seed, using this configuration: <environmentMode>REPRODUCIBLE</environmentMode> <randomSeed>50</randomSeed>

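The difference in Optaplanner behavior shows up in my use of a VariableListener. I have a set of custom MoveFactory classes for a model derived from nurse rostering. Each custom factory generates a different set of moves for each STEP, and each one bases its moves on a set of compute-intensive, state-dependent precomputations. I created a MoveFactoryHelper class that does the precomputation, and I call the helper at the start of each custom MoveFactory's createMoveList method (I have not yet tried migrating to the newer Optaplanner iterative move generation option).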

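To avoid repeating the computation for every move factory, MoveFactoryHelper stores its results for reuse and decides when to recompute based on "dirty" flags. The flags are set in a VariableListener registered on an (otherwise unused) shadow variable of the model's PlanningEntity, and they are cleared by MoveFactoryHelper when it recomputes: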

ShiftAssignment.java

    @PlanningEntity(movableEntitySelectionFilter = MovableShiftAssignmentSelectionFilter.class,
        difficultyComparatorClass = ShiftAssignmentDifficultyComparator.class)
    @XStreamAlias("ShiftAssignment")
    public class ShiftAssignment extends AbstractPersistable {

        ...

        @PlanningVariable(valueRangeProviderRefs = {"employeeRange"},
            strengthComparatorClass = EmployeeStrengthComparator.class
            )
        private Employee employee;

        ...

        @CustomShadowVariable( variableListenerClass=UpdatingEmployeeVariableListener.class, 
            sources={@PlanningVariableReference(variableName="employee", entityClass=ShiftAssignment.class)})
        private Employee notifierEmployee;  // TODO is there a better way to notify move factory of changes in problem facts?

UpdatingEmployeeVariableListener.java

    private static final Logger logger = LoggerFactory.getLogger(UpdatingEmployeeVariableListener.class);

    private static final boolean initiallyDirty = true;
    private static Map<Thread, Boolean> employeeShiftAssignmentEntityDirty = new HashMap<Thread, Boolean>();
    private static Map<Thread, Boolean> employeeShiftAssignmentMapDirty = new HashMap<Thread, Boolean>();

    private static final boolean useThreadFlags = false;

    // debug monitoring
    private static Map<Thread, Integer> countDirtyAllFlags = new HashMap<Thread, Integer>();
    private static Map<Thread, Integer> countBeforeEntityAdded = new HashMap<Thread, Integer>();
    private static Map<Thread, Integer> countAfterEntityAdded = new HashMap<Thread, Integer>();
    private static Map<Thread, Integer> countBeforeVariableChanged = new HashMap<Thread, Integer>();
    private static Map<Thread, Integer> countAfterVariableChanged = new HashMap<Thread, Integer>();
    private static Map<Thread, Integer> countBeforeEntityRemoved = new HashMap<Thread, Integer>();
    private static Map<Thread, Integer> countAfterEntityRemoved = new HashMap<Thread, Integer>();

    public UpdatingEmployeeVariableListener() {
        // no action
    }

    private static Thread getActiveThread() {
        return useThreadFlags ? Thread.currentThread() : null;
    }

    public static void setFlagsDirty() {
        countDirtyAllFlags.put(getActiveThread(), 1+countDirtyAllFlags.getOrDefault(getActiveThread(), 0));
        employeeShiftAssignmentEntityDirty.put(getActiveThread(), true);
        employeeShiftAssignmentMapDirty.put(getActiveThread(), true);
    }

    @Override
    public void beforeEntityAdded(@SuppressWarnings("rawtypes") ScoreDirector scoreDirector, ShiftAssignment entity) {
        countBeforeEntityAdded.put(getActiveThread(), 1+countBeforeEntityAdded.getOrDefault(getActiveThread(), 0));
        employeeShiftAssignmentMapDirty.put(getActiveThread(), true);
    }

    @Override
    public void afterEntityAdded(@SuppressWarnings("rawtypes") ScoreDirector scoreDirector, ShiftAssignment entity) {
        countAfterEntityAdded.put(getActiveThread(), 1+countAfterEntityAdded.getOrDefault(getActiveThread(), 0));
        employeeShiftAssignmentMapDirty.put(getActiveThread(), true);
    }

    @Override
    public void beforeVariableChanged(@SuppressWarnings("rawtypes") ScoreDirector scoreDirector,
            ShiftAssignment entity) {
        countBeforeVariableChanged.put(getActiveThread(), 1+countBeforeVariableChanged.getOrDefault(getActiveThread(), 0));
        employeeShiftAssignmentMapDirty.put(getActiveThread(), true);
    }

    @Override
    public void afterVariableChanged(@SuppressWarnings("rawtypes") ScoreDirector scoreDirector,
            ShiftAssignment entity) {
        countAfterVariableChanged.put(getActiveThread(), 1+countAfterVariableChanged.getOrDefault(getActiveThread(), 0));
        employeeShiftAssignmentMapDirty.put(getActiveThread(), true);
    }

    @Override
    public void beforeEntityRemoved(@SuppressWarnings("rawtypes") ScoreDirector scoreDirector, ShiftAssignment entity) {
        countBeforeEntityRemoved.put(getActiveThread(), 1+countBeforeEntityRemoved.getOrDefault(getActiveThread(), 0));
        employeeShiftAssignmentMapDirty.put(getActiveThread(), true);
    }

    @Override
    public void afterEntityRemoved(@SuppressWarnings("rawtypes") ScoreDirector scoreDirector, ShiftAssignment entity) {
        countAfterEntityRemoved.put(getActiveThread(), 1+countAfterEntityRemoved.getOrDefault(getActiveThread(), 0));
        employeeShiftAssignmentMapDirty.put(getActiveThread(), true);
    }

    /**
     * @return the employeeShiftAssignmentEntityDirty
     */
    public static boolean isEmployeeShiftAssignmentEntityDirty() {
        return employeeShiftAssignmentEntityDirty.getOrDefault(getActiveThread(), initiallyDirty);
    }

    /**
     * clears isEntityDirty, implying that the (externally maintained) employee shift assignment entity list has been updated 
     */
    public static void clearEmployeeShiftAssignmentEntityDirty() {
        employeeShiftAssignmentEntityDirty.put(getActiveThread(), false);       
    }

    /**
     * @return the mapDirty (which is depending also on entityDirty)
     */
    public static boolean isEmployeeShiftAssignmentMapDirty() {
        return employeeShiftAssignmentMapDirty.getOrDefault(getActiveThread(), initiallyDirty) || isEmployeeShiftAssignmentEntityDirty();
    }

    /**
     * clears isMapDirty, implying that the (externally maintained) employee shift assignment map has been updated (as well as the underlying entity) 
     */
    public static void clearEmployeeShiftAssignmentMapDirty() {
        clearEmployeeShiftAssignmentEntityDirty();
        employeeShiftAssignmentMapDirty.put(getActiveThread(), false);
        logger.debug("Clearing dirty flag: (AF={}, BEA={}, AEA={}, BVC={}, AVC={}, BER={}, AER={}) thread={}, employeeShiftAssignmentEntityDirty={}, employeeShiftAssignmentMapDirty={}", 
                countDirtyAllFlags.getOrDefault(getActiveThread(), 0),
                countBeforeEntityAdded.getOrDefault(getActiveThread(), 0),
                countAfterEntityAdded.getOrDefault(getActiveThread(), 0),
                countBeforeVariableChanged.getOrDefault(getActiveThread(), 0),
                countAfterVariableChanged.getOrDefault(getActiveThread(), 0),
                countBeforeEntityRemoved.getOrDefault(getActiveThread(), 0),
                countAfterEntityRemoved.getOrDefault(getActiveThread(), 0),
                getActiveThread(),
                employeeShiftAssignmentEntityDirty, 
                employeeShiftAssignmentMapDirty);
        clearCounts();
    }

    private static void clearCounts() {
        countDirtyAllFlags.put(getActiveThread(), 0);
        countBeforeEntityAdded.put(getActiveThread(), 0);
        countAfterEntityAdded.put(getActiveThread(), 0);
        countBeforeVariableChanged.put(getActiveThread(), 0);
        countAfterVariableChanged.put(getActiveThread(), 0);
        countBeforeEntityRemoved.put(getActiveThread(), 0);
        countAfterEntityRemoved.put(getActiveThread(), 0);
    }
}

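(Note that the boolean and integer maps here are effectively single booleans and integers: because useThreadFlags is a final set to false, the Thread used as the key in the map lookups is currently always null.)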
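The note above can be checked with a minimal, self-contained sketch. The class and method names (`NullKeyFlagDemo`, `activeThread`, `sharedAcrossThreads`) are hypothetical and not part of the original code; the only assumption carried over from the post is that the map key is `null` whenever `useThreadFlags` is false. Since `HashMap` permits a single `null` key, a write from any thread lands in the same slot:

```java
import java.util.HashMap;
import java.util.Map;

public class NullKeyFlagDemo {
    // Mirrors useThreadFlags=false from the post (hypothetical constant name).
    static final boolean USE_THREAD_FLAGS = false;

    // Same idea as getActiveThread() in the listener: null unless per-thread flags are on.
    static Thread activeThread() {
        return USE_THREAD_FLAGS ? Thread.currentThread() : null;
    }

    /** Returns true if a write made on another thread overwrote the single shared slot. */
    static boolean sharedAcrossThreads() {
        Map<Thread, Boolean> dirty = new HashMap<>();
        dirty.put(activeThread(), true);              // written on the calling thread

        Thread other = new Thread(() -> dirty.put(activeThread(), false));
        other.start();
        try {
            other.join();                             // join makes the other thread's write visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // One entry under the null key, holding the value written by the other thread.
        return dirty.size() == 1 && !dirty.get(null);
    }

    public static void main(String[] args) {
        System.out.println("single shared slot: " + sharedAcrossThreads());
    }
}
```

With the flag switched on, each thread would get its own entry instead, but a plain `HashMap` is not safe for concurrent writes, so that variant would need a concurrent map or external synchronization.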

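I confirmed that only the MoveFactory objects call MoveFactoryHelper. Likewise, apart from the VariableListener annotation above and the flag queries/clears from MoveFactoryHelper, the only other calls into UpdatingEmployeeVariableListener are calls to UpdatingEmployeeVariableListener.setFlagsDirty() before solving starts: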

        @Override
        public void actionPerformed(ActionEvent e) {
            UpdatingEmployeeVariableListener.setFlagsDirty();
            setSolvingState(true);
            Solution_ problem = solutionBusiness.getSolution();
            new SolveWorker(problem).execute();
        }

and after solving is stopped:

    solver.terminateEarly();
    UpdatingEmployeeVariableListener.setFlagsDirty();

The per-thread map scaffolding is new, but the underlying use of the boolean flags has run successfully for years:

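  1. The flags become dirty because Optaplanner calls beforeVariableChanged and afterVariableChanged on the planning entities.
  2. The first MoveFactory calls MoveFactoryHelper, which calls UpdatingEmployeeVariableListener.isEmployeeShiftAssignmentMapDirty(); this returns true. MoveFactoryHelper recomputes from the current state and then calls to clear the dirty flags.
  3. The remaining MoveFactory objects call MoveFactoryHelper, which sees false from the is...Dirty() query and therefore reuses its computation.
  4. Optaplanner tests many candidate moves, which dirties the flags again. After a move has been picked for the step, the MoveFactory.createMoveList methods are called again early in the next step, and the cycle repeats.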
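The four-step cycle above can be sketched without any Optaplanner dependency. This is a simplified illustration, not the actual MoveFactoryHelper: `DirtyFlag`, `MoveFactoryHelperSketch`, and `DirtyFlagCycleDemo` are hypothetical names, the flag is an instance field rather than the static state used in the post, and a string stands in for the expensive precomputation.

```java
/** Hypothetical stand-in for the dirty flag kept by the variable listener. */
final class DirtyFlag {
    private boolean dirty = true;          // starts dirty, like initiallyDirty in the post

    void markDirty() { dirty = true; }     // what a variable-change notification would do
    boolean isDirty() { return dirty; }
    void clear() { dirty = false; }        // what the helper does after recomputing
}

/** Hypothetical helper: recomputes only when the flag is dirty, otherwise reuses the cache. */
final class MoveFactoryHelperSketch {
    private final DirtyFlag flag;
    private String cached;
    private int recomputeCount;

    MoveFactoryHelperSketch(DirtyFlag flag) { this.flag = flag; }

    String precomputation(String workingState) {
        if (flag.isDirty()) {
            cached = "precomputed for " + workingState; // placeholder for the expensive work
            recomputeCount++;
            flag.clear();
        }
        return cached;
    }

    int recomputeCount() { return recomputeCount; }
}

public class DirtyFlagCycleDemo {
    public static void main(String[] args) {
        DirtyFlag flag = new DirtyFlag();
        MoveFactoryHelperSketch helper = new MoveFactoryHelperSketch(flag);

        for (int step = 0; step < 3; step++) {
            // Items 2 and 3: ten factories ask the helper; only the first triggers a recompute.
            for (int factory = 0; factory < 10; factory++) {
                helper.precomputation("step " + step);
            }
            // Items 1 and 4: evaluating candidate moves dirties the flag again.
            flag.markDirty();
        }
        System.out.println("recomputations: " + helper.recomputeCount());
    }
}
```

In this single-threaded sketch the helper recomputes exactly once per step, which matches the one "Clearing dirty flag" line per step expected in the logs when no moveThreadCount is configured.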

Log details showing the odd Optaplanner behavior

After upgrading to 7.12, with no moveThreadCount XML element defined in the configuration, the code continues to run correctly and reproducibly:

11:20:37.274 INFO  Solving started: time spent (422), best score (0hard/-5340soft), environment mode (REPRODUCIBLE), random (JDK with seed 50).
11:20:37.280 DEBUG     CH step (0), time spent (428), score (0hard/-5340soft), selected move count (1), picked move ((NullEmployee-nochange) 2018-12-25/D/0 {...}).
11:20:37.280 INFO  Construction Heuristic phase (0) ended: time spent (428), best score (0hard/-5340soft), score calculation speed (1000/sec), step total (1).

11:20:37.561 DEBUG Clearing dirty flag: (AF=1, BEA=0, AEA=0, BVC=0, AVC=0, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
11:20:44.303 DEBUG     LS step (0), time spent (7451), score (0hard/-4919soft), new best score (0hard/-4919soft), accepted/selected move count (1/300), picked move ([(WeekAlign-f) {...}, (WeekAlign-f) {...}]).
11:20:44.310 DEBUG Factories(10) STEP moves: 1594020

11:20:44.312 DEBUG Clearing dirty flag: (AF=0, BEA=0, AEA=0, BVC=13800, AVC=13800, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
11:20:46.609 DEBUG     LS step (1), time spent (9757), score (0hard/-5266soft),     best score (0hard/-4919soft), accepted/selected move count (1/24), picked move ((SlidePair) 2019-06-04/1/0... 1 shifts {...} <-slide-> {...} 3 shifts ...2019-06-07/1/0).
11:20:46.610 DEBUG Factories(10) STEP moves: 473969

11:20:46.613 DEBUG Clearing dirty flag: (AF=0, BEA=0, AEA=0, BVC=746, AVC=746, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
11:20:48.124 DEBUG     LS step (2), time spent (11272), score (0hard/-5083soft),     best score (0hard/-4919soft), accepted/selected move count (1/110), picked move ((CloseSlack-newEmplS) 2019-05-28/D/2(7 shifts) <-swap-> {...} 2019-05-21/D/3(7 shifts)).
11:20:48.124 DEBUG Factories(10) STEP moves: 477083

(The Factories debug log line after each step just shows how many moves the 10 custom factories offered to the solver in the previous step.)

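However, when I add the line <moveThreadCount>1</moveThreadCount> to the config file, I see intermittent calls to rebuild MoveFactoryHelper in the middle of Optaplanner making variable changes (see LS step 2 below):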

10:46:05.413 INFO  Solving started: time spent (360), best score (0hard/-5340soft), environment mode (REPRODUCIBLE), random (JDK with seed 50).
10:46:05.746 DEBUG     CH step (0), time spent (693), score (0hard/-5340soft), selected move count (1), picked move ((NullEmployee-nochange) 2018-12-25/D/0 {...}).
10:46:05.746 INFO  Construction Heuristic phase (0) ended: time spent (693), best score (0hard/-5340soft), score calculation speed (9/sec), step total (1).

10:46:05.949 DEBUG Clearing dirty flag: (AF=1, BEA=0, AEA=0, BVC=0, AVC=0, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
10:46:13.014 DEBUG     LS step (0), time spent (7961), score (0hard/-4919soft), new best score (0hard/-4919soft), accepted/selected move count (1/300), picked move ([(WeekAlign-f) {...}, (WeekAlign-f) {...}]).
10:46:13.019 DEBUG Factories(10) STEP moves: 1594020

10:46:13.021 DEBUG Clearing dirty flag: (AF=0, BEA=0, AEA=0, BVC=13844, AVC=13844, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
10:46:14.741 DEBUG     LS step (1), time spent (9688), score (0hard/-5266soft),     best score (0hard/-4919soft), accepted/selected move count (1/19), picked move ((SlidePair) 2019-06-04/1/0... 1 shifts {...} <-slide-> {...} 3 shifts ...2019-06-07/1/0).
10:46:14.741 DEBUG Factories(10) STEP moves: 473969

10:46:14.743 DEBUG Clearing dirty flag: (AF=0, BEA=0, AEA=0, BVC=582, AVC=582, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
10:46:14.743 DEBUG Clearing dirty flag: (AF=0, BEA=0, AEA=0, BVC=20, AVC=20, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
10:46:16.444 DEBUG     LS step (2), time spent (11391), score (0hard/-5083soft),     best score (0hard/-4919soft), accepted/selected move count (1/97), picked move ((CloseSlack-newEmplS) {...} 2019-05-28/D/2(7 shifts) <-swap-> {...} 2019-05-21/D/3(7 shifts)).
10:46:16.445 DEBUG Factories(10) STEP moves: 1580032

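Two comments. First, there is some loss of reproducible execution: for example, there were originally 13800 before/after variable changes, and now there are 13844. I presume this is related to multithreading being "enabled", even though only one thread is in use.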

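Second, the number of variable changes, and the details of the "split" where two calls to clear the dirty flag (each following a rebuild of MoveFactoryHelper) can be seen, vary from run to run. That makes me think this is a multithreading race issue, for example: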

12:16:27.712 INFO  Solving started: time spent (375), best score (0hard/-5340soft), environment mode (REPRODUCIBLE), random (JDK with seed 50).
12:16:28.043 DEBUG     CH step (0), time spent (706), score (0hard/-5340soft), selected move count (1), picked move ((NullEmployee-nochange) 2018-12-25/D/0 {...}).
12:16:28.043 INFO  Construction Heuristic phase (0) ended: time spent (706), best score (0hard/-5340soft), score calculation speed (9/sec), step total (1).

12:16:28.288 DEBUG Clearing dirty flag: (AF=1, BEA=0, AEA=0, BVC=0, AVC=0, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
12:16:35.148 DEBUG     LS step (0), time spent (7811), score (0hard/-4919soft), new best score (0hard/-4919soft), accepted/selected move count (1/300), picked move ([(WeekAlign-f) {...}, (WeekAlign-f) {...}]).
12:16:35.158 DEBUG Factories(10) STEP moves: 1594020

12:16:35.160 DEBUG Clearing dirty flag: (AF=0, BEA=0, AEA=0, BVC=13821, AVC=13821, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
12:16:35.160 DEBUG Clearing dirty flag: (AF=0, BEA=0, AEA=0, BVC=0, AVC=0, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
12:16:37.050 DEBUG     LS step (1), time spent (9713), score (0hard/-5266soft),     best score (0hard/-4919soft), accepted/selected move count (1/22), picked move ((SlidePair) 2019-06-04/1/0... 1 shifts {...} <-slide-> {...} 3 shifts ...2019-06-07/1/0).
12:16:37.053 DEBUG Factories(10) STEP moves: 1576812

12:16:37.054 DEBUG Clearing dirty flag: (AF=0, BEA=0, AEA=0, BVC=763, AVC=763, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
12:16:37.055 DEBUG Clearing dirty flag: (AF=0, BEA=0, AEA=0, BVC=23, AVC=23, BER=0, AER=0) thread=null, employeeShiftAssignmentEntityDirty={null=false}, employeeShiftAssignmentMapDirty={null=false}
12:16:39.414 DEBUG     LS step (2), time spent (12077), score (0hard/-5083soft),     best score (0hard/-4919soft), accepted/selected move count (1/98), picked move ((CloseSlack-newEmplS) {...} 2019-05-28/D/2(7 shifts) <-swap-> {...} 2019-05-21/D/3(7 shifts)).
12:16:39.414 DEBUG Factories(10) STEP moves: 1580534
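One ordering that would produce such a split can be replayed deterministically. This is only an illustration of the suspicion described here, not a diagnosis of Optaplanner's internals: `SplitClearSimulation` and `replay` are hypothetical names, `c` stands for one variable-change notification, and `f` stands for one factory asking the helper. If some notifications arrive after the first factory has already triggered a rebuild, a later factory sees the flag dirty again and a second clear is logged with a small count.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitClearSimulation {
    /**
     * Replays events in the given order and returns the change count that
     * would be logged at each "Clearing dirty flag" line.
     * 'c' = one variable-change notification, 'f' = one factory call to the helper.
     */
    static List<Integer> replay(String events) {
        boolean dirty = false;
        int changes = 0;
        List<Integer> cleared = new ArrayList<>();
        for (char event : events.toCharArray()) {
            if (event == 'c') {
                changes++;
                dirty = true;
            } else if (event == 'f' && dirty) {
                cleared.add(changes);  // rebuild, log the count, then reset
                changes = 0;
                dirty = false;
            }
        }
        return cleared;
    }

    public static void main(String[] args) {
        // All notifications finish before the factories run: one clear.
        System.out.println(replay("cccccff"));
        // Two notifications arrive late, between two factory calls: two clears.
        System.out.println(replay("cccfccf"));
    }
}
```

The first replay yields a single entry of 5; the second yields 3 followed by 2, the same kind of large-then-small pair seen in the logs above. It does not account for the second clear that shows zero changes, nor for the run-to-run differences in totals.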

So I have two questions:

  1. Is it correct for Optaplanner to behave differently with no moveThreadCount defined than with a value of 1? As a user, this seems unexpected.

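  2. What might I or Optaplanner be doing that causes the custom MoveFactory calls (to generate the move lists) to start early, before all of Optaplanner's variable changes from the previous step are complete, even in a single-thread configuration? I wonder whether the "picked move" is applied and the new createMoveList calls begin before the threads scoring/testing the last moves of the previous step have all paused. Even if so, I do not see why this would cause non-reproducible execution, unless the still-running threads pick moves randomly (which would seem to produce non-reproducible execution).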


This occurs in both the "Run" and "Debug" execution environments.

Thank you.

1 answer:

Answer 0 (score: 0)

This appears to be resolved in Optaplanner 7.15.0. Updating to that version fixes the problem.