Question

I have a data frame like this (df1):

Position    Available   Minimum
Position 1         3    2
Position 2         1    1
Position 3         7    5
Position 4        12    8
Position 5        24    17
Position 6         7    5
Position 7        18    13
Position 8        10    7
Position 9        25    18

I also have another data frame like this (df2):

Candidate Choice1    Choice2     Choice3     Choice4      Score 
Name 1   Position 2 Position 4  Position 6  Position 9     62
Name 2   Position 8 Position 2  Position 6  Position 5     70
Name 3   Position 5 Position 4  Position 1  Position 6     42
Name 4   Position 8 Position 9  Position 5  Position 2     20
Name 5   Position 6 Position 1  Position 1  Position 1     6
Name 6   Position 4 Position 7  Position 2  Position 4     7
Name 7   Position 1 Position 3  Position 8  Position 6    56
.            .      .           .           .             .
.            .      .           .           .             .
Name n  Position 6  Position 6  Position 4  Position 5     8

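Now I want to assign each candidate to a position based on their score and their choices. If a candidate does not get Choice1, we have to look at Choice2 and assign that instead, and so on.

For example: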

Name    Choice1     Choice2      Choice3    Choice4   Score Assigned in
Name 2  Position 2  Position 8  Position 6  Position 5  70  Position 2
Name 1  Position 2  Position 4  Position 6  Position 9  62  Position 4
Name 7  Position 1  Position 3  Position 8  Position 6  56  Position 1
.   .   .   .   .   .   .
.   .   .   .   .   .   .

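Conditions:

The highest scorer gets their first preference.
If a candidate's preferred position is not available, assign him/her to some other available position.
In df1 there is a column named "Minimum", which is 70% of the total available seats for that particular position. We need to fill at least that many seats in each position. (If the total number of candidates is less than 70% of the total available seats, this can be ignored.)

I have no idea how to get started with this logic in R. Any help is much appreciated!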

Answer 1

You could create vectors for the assignments and for the positions

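n <- nrow(df2) # number of candidates
m <- nrow(df1) # number of positions
assignments <- rep(NA, n) # position assigned to each candidate
positions <- rep(0, m) # seats filled so far at each position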

and loop over df2 in order of decreasing score (the following code is untested). Edit: shortened, with an example for multiple choices.

for (i in order(df2[, "Score"], decreasing = TRUE)) {
    # try the candidate's choices in order of preference
    for (col in c("Choice1", "Choice2", "Choice3", "Choice4")) {
        choice <- as.character(df2[i, col])
        j <- match(choice, df1$Position) # row of this position in df1
        if (positions[[j]] < df1[j, "Available"]) {
            assignments[[i]] <- choice
            positions[[j]] <- positions[[j]] + 1
            break # move to next candidate
        }
    }
    # if none of the choices is available, assignments[[i]] stays NA;
    # that case still has to be handled
}
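To make the fallback case concrete, here is a minimal sketch of the same greedy idea wrapped in a function. The name `assign_greedy` and its arguments are hypothetical, and the fallback rule (take the first position in df1 that still has a free seat) is only one possible reading of "some available position". The data are df1 from the question and the three candidates from the question's worked example.

```r
# Hypothetical helper: greedy assignment in order of decreasing score.
assign_greedy <- function(df1, df2,
                          choice_cols = c("Choice1", "Choice2", "Choice3", "Choice4")) {
  capacity <- setNames(df1$Available, df1$Position)     # seats per position
  filled   <- setNames(rep(0, nrow(df1)), df1$Position) # seats taken so far
  assigned <- rep(NA_character_, nrow(df2))
  for (i in order(df2$Score, decreasing = TRUE)) {
    wanted <- as.character(unlist(df2[i, choice_cols]))
    open   <- names(capacity)[filled < capacity]        # positions with free seats
    # first open choice; otherwise (assumed rule) the first open position
    pick <- c(wanted[wanted %in% open], open)[1]
    if (!is.na(pick)) {
      assigned[i] <- pick
      filled[[pick]] <- filled[[pick]] + 1
    }
  }
  assigned
}

df1 <- data.frame(
  Position  = paste("Position", 1:9),
  Available = c(3, 1, 7, 12, 24, 7, 18, 10, 25),
  stringsAsFactors = FALSE
)
# The three candidates shown in the question's example table
df2 <- data.frame(
  Name    = c("Name 2", "Name 1", "Name 7"),
  Choice1 = c("Position 2", "Position 2", "Position 1"),
  Choice2 = c("Position 8", "Position 4", "Position 3"),
  Choice3 = c("Position 6", "Position 6", "Position 8"),
  Choice4 = c("Position 5", "Position 9", "Position 6"),
  Score   = c(70, 62, 56),
  stringsAsFactors = FALSE
)
df2$Assigned <- assign_greedy(df1, df2)
print(df2[, c("Name", "Score", "Assigned")])
```

On these three rows, Name 2 takes the single seat in Position 2, so Name 1 falls through to Position 4, which matches the "Assigned in" column of the question's example.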

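But this does not take the minimum requirement into account. For that, it may be a good idea to formulate the problem as a linear optimization problem and solve it with a package such as lpSolve or lpSolveAPI. Here is an attempt at a formulation (again, untested):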

# the variable we are interested in
p_ij ... candidate i at position j (0 or 1)

# auxiliary variables that are optimized by the LP
cf_ik ... candidate i's choice k was fulfilled (0 or 1)

# pre-set constants
w_k ... weight for choice k (e.g. w_1=4, ..., w_4=1)
prefs_ijk ... candidate i chose position j as their k-th choice (0 or 1)
highscore_i ... score for candidate i # may need to rescale this or the w_k
min_pj ... minimum seats at position j
max_pj ... maximum seats at position j

# objective function
obj: max sum_i highscore_i * 
     (w_1 * cf_i1 + w_2 * cf_i2 + w_3 * cf_i3 + w_4 * cf_i4)

subject to:

# make sure all positions are appropriately filled
min_pj <= sum_i p_ij <= max_pj for all j

# one position per candidate
sum_j p_ij == 1 for all i

# link between fulfilled choices cf_ik and assignment p_ij
cf_ik = sum_j prefs_ijk * p_ij for all i, k

One challenge here is converting the data frames into an LP; another is interpreting what comes back from the LP solver.
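As a rough illustration of both steps, the sketch below builds the inputs for `lp()` from the lpSolve package out of two small data frames and reads the solution back into an assignment column. The toy data, the weights `w`, and all variable names are assumptions made for illustration, not taken from the question. The auxiliary cf_ik are substituted out, so the objective coefficient of p_ij is simply the candidate's score times the weight of the best choice that names position j. Maximizing this weighted sum only approximates "highest scorer chooses first", so treat it as a starting point rather than a finished solution.

```r
library(lpSolve)

# Toy data (hypothetical, much smaller than the tables in the question)
df1 <- data.frame(
  Position  = c("Position 1", "Position 2", "Position 3"),
  Available = c(2, 3, 3),
  Minimum   = c(1, 2, 2),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Candidate = paste("Name", 1:6),
  Choice1 = c("Position 1", "Position 1", "Position 1",
              "Position 2", "Position 2", "Position 1"),
  Choice2 = c("Position 2", "Position 2", "Position 3",
              "Position 1", "Position 3", "Position 2"),
  Score   = c(62, 70, 42, 20, 6, 56),
  stringsAsFactors = FALSE
)
n <- nrow(df2)
m <- nrow(df1)
w <- c(Choice1 = 2, Choice2 = 1) # assumed weights w_k

# coef[i, j]: score_i * weight of candidate i's best choice naming position j
coef <- matrix(0, n, m)
for (k in rev(names(w))) { # lowest priority first, so better choices overwrite
  j <- match(df2[[k]], df1$Position)
  coef[cbind(seq_len(n), j)] <- df2$Score * w[[k]]
}

# Variables are p_ij flattened column by column: index (j - 1) * n + i
one_per_candidate  <- kronecker(t(rep(1, m)), diag(n)) # sum_j p_ij
seats_per_position <- kronecker(diag(m), t(rep(1, n))) # sum_i p_ij

res <- lp(
  direction    = "max",
  objective.in = as.vector(coef),
  const.mat    = rbind(one_per_candidate, seats_per_position, seats_per_position),
  const.dir    = c(rep("=", n), rep(">=", m), rep("<=", m)),
  const.rhs    = c(rep(1, n), df1$Minimum, df1$Available),
  all.bin      = TRUE
)
stopifnot(res$status == 0) # 0 means the solver found a solution

# Decode: reshape the solution into the n x m matrix p and read off each row
p <- matrix(round(res$solution), n, m)
df2$Assigned <- df1$Position[max.col(p, ties.method = "first")]
print(df2)
```

Because every candidate must be placed and every minimum must be met, the solver may put a low-scoring candidate in a position they did not list; such an assignment has coefficient 0 in the objective. If there are too few candidates to meet all minimums, `res$status` will be non-zero, which corresponds to the case where the question says the minimum can be ignored.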
