Assigning people to a position in R based on their choices and scores

Time: 2015-04-04 10:25:45

Tags: r loops

I have a data frame like this (df1):

Position    Available   Minimum
Position 1         3    2
Position 2         1    1
Position 3         7    5
Position 4        12    8
Position 5        24    17
Position 6         7    5
Position 7        18    13
Position 8        10    7
Position 9        25    18

I also have another data frame like this (df2):

Candidate Choice1    Choice2     Choice3     Choice4      Score 
Name 1   Position 2 Position 4  Position 6  Position 9     62
Name 2   Position 8 Position 2  Position 6  Position 5     70
Name 3   Position 5 Position 4  Position 1  Position 6     42
Name 4   Position 8 Position 9  Position 5  Position 2     20
Name 5   Position 6 Position 1  Position 1  Position 1     6
Name 6   Position 4 Position 7  Position 2  Position 4     7
Name 7   Position 1 Position 3  Position 8  Position 6    56
.            .      .           .           .             .
.            .      .           .           .             .
Name n  Position 6  Position 6  Position 4  Position 5     8

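Now I want to assign each candidate to a position based on their score and their choices. If a candidate does not get Choice1, we have to look at Choice2 and assign that instead, and so on.

An example of the expected result: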

Name    Choice1     Choice2      Choice3    Choice4   Score Assigned in
Name 2  Position 2  Position 8  Position 6  Position 5  70  Position 2
Name 1  Position 2  Position 4  Position 6  Position 9  62  Position 4
Name 7  Position 1  Position 3  Position 8  Position 6  56  Position 1
.   .   .   .   .   .   .
.   .   .   .   .   .   .

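Conditions:

  1. The highest scorer gets their first preference.

  2. If the first preference is not available, assign him/her to one of the available positions.

  3. In df1 there is a column named "Minimum", which is 70% of the total seats available for that position. We need to fill at least that many seats for each position. (If the total number of candidates is less than 70% of the total available seats, we can ignore this.)

I don't know how to get started with this logic in R. Any help is much appreciated!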

1 answer:

Answer 0 (score: 1)

You can create an assignments vector and a positions vector

assignments <- rep(NA, n) # n ... candidates
positions <- rep(0, m) # m ... positions
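To try the code in this answer on something concrete, the two tables from the question can be typed in by hand. This is only a sketch: df2 holds just the seven candidates that are actually shown in the question (the remaining rows are elided there), and n and m are simply taken from the row counts, so they would need to be defined before the two vectors above are created.

```r
# Position table from the question (df1)
df1 <- data.frame(
  Position  = paste("Position", 1:9),
  Available = c(3, 1, 7, 12, 24, 7, 18, 10, 25),
  Minimum   = c(2, 1, 5, 8, 17, 5, 13, 7, 18),
  stringsAsFactors = FALSE
)

# The seven candidates listed in the question (df2)
df2 <- data.frame(
  Candidate = paste("Name", 1:7),
  Choice1   = paste("Position", c(2, 8, 5, 8, 6, 4, 1)),
  Choice2   = paste("Position", c(4, 2, 4, 9, 1, 7, 3)),
  Choice3   = paste("Position", c(6, 6, 1, 5, 1, 2, 8)),
  Choice4   = paste("Position", c(9, 5, 6, 2, 1, 4, 6)),
  Score     = c(62, 70, 42, 20, 6, 7, 56),
  stringsAsFactors = FALSE
)

n <- nrow(df2)  # number of candidates
m <- nrow(df1)  # number of positions
```

With stringsAsFactors = FALSE the choice columns stay plain character strings, which makes it easier to look positions up by name later on.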

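and loop over df2, ordered by score (the following code is untested). Edit: sort in decreasing order, added an example for multiple choices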

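for (i in order(df2[, "Score"], decreasing = TRUE)) {
    # row of df1 that matches the candidate's first choice
    choice <- match(as.character(df2[i, "Choice1"]), df1[, "Position"])
    if (positions[[choice]] < df1[choice, "Available"]) {
        assignments[[i]] <- choice  # stores the row index into df1
        positions[[choice]] <- positions[[choice]] + 1
        next # move to next candidate
    }
    # first choice is full, try the second one
    choice <- match(as.character(df2[i, "Choice2"]), df1[, "Position"])
    if (positions[[choice]] < df1[choice, "Available"]) {
        assignments[[i]] <- choice
        positions[[choice]] <- positions[[choice]] + 1
        next
    }
    # check Choice3 and Choice4 in the same way, and handle the case
    # that none of the choices is available
}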
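The same idea can be written more compactly with an inner loop over the choice columns. The following is a minimal sketch under a few assumptions: assign_greedy() is a hypothetical helper name (not from any package), every choice is expected to appear in df1$Position, and a candidate whose four choices are all full is simply left as NA. It also reports which positions end up below their minimum, since the greedy pass does nothing to enforce that.

```r
# Greedy assignment: highest score first, each candidate gets the first
# of their listed choices that still has a free seat.
assign_greedy <- function(df1, df2,
                          choice_cols = c("Choice1", "Choice2", "Choice3", "Choice4")) {
  filled   <- setNames(rep(0, nrow(df1)), df1$Position)  # seats taken so far
  capacity <- setNames(df1$Available, df1$Position)      # seats per position
  assigned <- rep(NA_character_, nrow(df2))

  for (i in order(df2$Score, decreasing = TRUE)) {
    for (col in choice_cols) {
      choice <- as.character(df2[i, col])
      if (filled[[choice]] < capacity[[choice]]) {
        assigned[i] <- choice
        filled[[choice]] <- filled[[choice]] + 1
        break  # candidate placed, move on to the next one
      }
    }
    # if no listed choice had room, assigned[i] stays NA
  }

  list(
    result = cbind(df2, Assigned = assigned, stringsAsFactors = FALSE),
    filled = filled,
    short  = names(filled)[filled < df1$Minimum]  # positions below their minimum
  )
}

# Tiny made-up example: one seat at A, two seats at B
toy_pos  <- data.frame(Position = c("A", "B"), Available = c(1, 2),
                       Minimum = c(1, 1), stringsAsFactors = FALSE)
toy_cand <- data.frame(Candidate = c("x", "y", "z"),
                       Choice1 = c("A", "A", "B"), Choice2 = c("B", "B", "A"),
                       Choice3 = c("B", "B", "A"), Choice4 = c("B", "B", "A"),
                       Score = c(10, 30, 20), stringsAsFactors = FALSE)
out <- assign_greedy(toy_pos, toy_cand)
out$result[, c("Candidate", "Score", "Assigned")]
```

In the toy data, y has the highest score and takes the only seat at A, so x falls back to B. Any position names listed in out$short would have to be handled separately.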

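But this does not take the minimum requirement into account. For that, it might be a good idea to formulate the problem as a linear optimization problem and solve it with a package such as lpSolve or lpSolveAPI. Here is an attempt at a formulation (again, untested):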

# the variable we are interested in
p_ij ... candidate i at position j (0 or 1)

# auxiliary variables that are optimized by the LP
cf_ik ... candidate i's choice k was fulfilled (0 or 1)

# pre-set constants
w_k ... weight for choice k (e.g. w_1=4, ..., w_4=1)
prefs_ijk ... candidate i chose position j as their k-th choice (0 or 1)
highscore_i ... score for candidate i # may need to rescale this or the w_k
min_pj ... minimum seats at position j
max_pj ... maximum seats at position j

# objective function
obj: max sum_i highscore_i * 
     (w_1 * cf_i1 + w_2 * cf_i2 + w_3 * cf_i3 + w_4 * cf_i4)

subject to:

# make sure all positions are appropriately filled
min_pj <= sum_i p_ij <= max_pj for all j

# one position per candidate
sum_j p_ij == 1 for all i

# link between the fulfilled choices cf_ik and the assignment p_ij
cf_ik = sum_j prefs_ijk * p_ij for all i, k

One challenge here is translating the data frames into an LP; another is interpreting the output of the LP solver.
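As a rough illustration of that translation step, here is a sketch that builds the objective and constraint matrix for lpSolve::lp(). Several things in it are assumptions rather than part of the formulation above: build_lp() is a hypothetical helper name, the cf_ik are substituted out so that only the p_ij remain as variables, a position listed more than once by a candidate only counts at its best rank, and the total of the minimums is assumed to be at most the number of candidates (otherwise the LP is infeasible). The solver call is guarded so that the matrix construction can be inspected even without lpSolve installed.

```r
# Variables are the p_ij only, stored candidate-major:
# column (i - 1) * m + j means "candidate i at position j".
build_lp <- function(df1, df2, w = c(4, 3, 2, 1),
                     choice_cols = c("Choice1", "Choice2", "Choice3", "Choice4")) {
  n <- nrow(df2)
  m <- nrow(df1)

  # coefficient of p_ij: score_i * weight of the best rank at which
  # candidate i listed position j (0 if the position was not listed)
  obj <- numeric(n * m)
  for (i in seq_len(n)) {
    ranks <- match(df1$Position, unlist(df2[i, choice_cols]))  # NA if not chosen
    obj[(i - 1) * m + seq_len(m)] <- df2$Score[i] * ifelse(is.na(ranks), 0, w[ranks])
  }

  one_per_cand <- kronecker(diag(n), matrix(1, 1, m))  # row i sums p_i1..p_im
  seats        <- kronecker(matrix(1, 1, n), diag(m))  # row j sums p_1j..p_nj

  list(obj = obj,
       mat = rbind(one_per_cand, seats, seats),
       dir = c(rep("==", n), rep(">=", m), rep("<=", m)),
       rhs = c(rep(1, n), df1$Minimum, df1$Available))
}

# Tiny made-up example: one seat at A, two seats at B
toy_pos  <- data.frame(Position = c("A", "B"), Available = c(1, 2),
                       Minimum = c(1, 1), stringsAsFactors = FALSE)
toy_cand <- data.frame(Candidate = c("x", "y", "z"),
                       Choice1 = c("A", "A", "B"), Choice2 = c("B", "B", "A"),
                       Choice3 = c("B", "B", "A"), Choice4 = c("B", "B", "A"),
                       Score = c(10, 30, 20), stringsAsFactors = FALSE)
lp_parts <- build_lp(toy_pos, toy_cand)

if (requireNamespace("lpSolve", quietly = TRUE)) {
  sol <- lpSolve::lp("max", lp_parts$obj, lp_parts$mat,
                     lp_parts$dir, lp_parts$rhs, all.bin = TRUE)
  if (sol$status == 0) {  # status 0: the solver found an optimal solution
    # reshape the solution so that rows are candidates and columns are positions
    p <- matrix(sol$solution, nrow = nrow(toy_cand), byrow = TRUE)
    toy_cand$Assigned <- toy_pos$Position[apply(p, 1, which.max)]
  }
}
```

Reading the result back is then a matter of reshaping sol$solution into a candidates-by-positions matrix and picking the column holding the 1 in each row. A non-zero sol$status would suggest that the minimum and maximum seat constraints cannot all be met with the given candidates.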