Permutations | Combinations | Python

Time: 2018-02-17 07:51:26

Tags: python pandas numpy combinations

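I have a DataFrame that looks like this:

Issue       Options     Points
Bonus       10          4000
Bonus       8           3000
Bonus       6           2000
Bonus       4           1000
Bonus       2           0
Assignment  A           0
Assignment  B           -600
Assignment  C           -1200
Assignment  D           -1800
Assignment  E           -2400
Leave       35          1600
Leave       30          1200
Leave       25          800
Leave       20          400
Leave       15          0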

Is there a clever way to create the different combinations across the issues?

Ideally, the output would list every combination that picks exactly one option from each issue.

I would like to give each combination a score, to see which one is the strongest option.

Is this possible with the shape of my DataFrame?

My own attempt gave me every possible combination of rows, including ones that are not valid here, because you can pick only one option per issue, not several.

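1 answer:

Answer 0: (score: 3)

You can split the options by issue into a dictionary and then take the Cartesian product of the per-issue lists: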

>>> df

         Issue Options  Points
0        Bonus      10    4000
1        Bonus       8    3000
2        Bonus       6    2000
3        Bonus       4    1000
4        Bonus       2       0
5   Assignment       A       0
6   Assignment       B    -600
7   Assignment       C   -1200
8   Assignment       D   -1800
9   Assignment       E   -2400
10       Leave      35    1600
11       Leave      30    1200
12       Leave      25     800
13       Leave      20     400
14       Leave      15       0

Now let's create a dictionary whose keys are the issues and whose values are lists of dictionaries, one per row for that issue:

>>> d = {issue: df[df['Issue']==issue].copy().drop('Issue',
         axis=1).to_dict(orient='records') 
         for issue in df['Issue'].unique()}
>>> d
{'Assignment': [{'Options': 'A', 'Points': 0},
  {'Options': 'B', 'Points': -600},
  {'Options': 'C', 'Points': -1200},
  {'Options': 'D', 'Points': -1800},
  {'Options': 'E', 'Points': -2400}],
 'Bonus': [{'Options': '10', 'Points': 4000},
  {'Options': '8', 'Points': 3000},
  {'Options': '6', 'Points': 2000},
  {'Options': '4', 'Points': 1000},
  {'Options': '2', 'Points': 0}],
 'Leave': [{'Options': '35', 'Points': 1600},
  {'Options': '30', 'Points': 1200},
  {'Options': '25', 'Points': 800},
  {'Options': '20', 'Points': 400},
  {'Options': '15', 'Points': 0}]}
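The same mapping can also be built with `groupby`, which avoids filtering the frame once per issue. This is only a sketch: it assumes the column names shown above and uses a small hand-written frame in place of the real `df`.

```python
import pandas as pd

# Hypothetical cut-down frame with the same columns as `df` above.
df = pd.DataFrame({
    "Issue": ["Bonus", "Bonus", "Leave", "Leave"],
    "Options": ["10", "8", "35", "30"],
    "Points": [4000, 3000, 1600, 1200],
})

# sort=False keeps the issues in their order of first appearance.
d = {
    issue: group.drop(columns="Issue").to_dict(orient="records")
    for issue, group in df.groupby("Issue", sort=False)
}

print(list(d))
print(d["Bonus"])
```

Each value is a list of `{'Options': ..., 'Points': ...}` records, the same shape as in the dictionary comprehension above.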

Next, we can get every combination that picks one row per issue (the Cartesian product of the lists) this way:

>>> from itertools import product
>>> combinations = [dict(zip(d, v)) for v in product(*d.values())]
>>> combinations
[{'Assignment': {'Options': 'A', 'Points': 0},
  'Bonus': {'Options': '10', 'Points': 4000},
  'Leave': {'Options': '35', 'Points': 1600}},
 {'Assignment': {'Options': 'A', 'Points': 0},
  'Bonus': {'Options': '10', 'Points': 4000},
  'Leave': {'Options': '30', 'Points': 1200}},
 {'Assignment': {'Options': 'A', 'Points': 0},...]
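`itertools.product` takes one element from each input list, so a result can never contain two options for the same issue. A small self-contained sketch, using a hand-written subset with two options per issue (the names `options` and `combos` are only illustrative):

```python
from itertools import product

# Hypothetical cut-down version of the dictionary `d`: two options per issue.
options = {
    "Bonus": [{"Options": "10", "Points": 4000}, {"Options": "8", "Points": 3000}],
    "Assignment": [{"Options": "A", "Points": 0}, {"Options": "B", "Points": -600}],
    "Leave": [{"Options": "35", "Points": 1600}, {"Options": "30", "Points": 1200}],
}

# One row from each issue per combination: 2 * 2 * 2 = 8 in total.
combos = [dict(zip(options, rows)) for rows in product(*options.values())]

print(len(combos))  # 8
for combo in combos[:2]:
    print({issue: row["Options"] for issue, row in combo.items()})
```

With the full data (three issues, five options each) the same expression yields 5 * 5 * 5 = 125 combinations.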

For the first combination, we can build:

>>> issues = df['Issue'].unique()

>>> issues
array(['Bonus', 'Assignment', 'Leave'], dtype=object)

>>> c1 = ' + '.join([issue + ' %s'%combinations[0][issue]['Options'] 
                     for issue in issues])

>>> c1 
'Bonus 10 + Assignment A + Leave 35'

>>> c2 = ' + '.join([' %s'%combinations[0][issue]['Points'] for issue in issues])

>>> c2
' 4000 +  0 +  1600'

# Eval ' 4000 +  0 +  1600' to obtain the sum

>>> c3 = str(eval(c2))

>>> c3
'5600'
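The `eval` call works here because the string is built from numeric data, but summing the numbers directly avoids evaluating text as code. A sketch of that variant, with a hand-written `combination` standing in for `combinations[0]`:

```python
issues = ["Bonus", "Assignment", "Leave"]

# Stand-in for combinations[0] from the session above.
combination = {
    "Bonus": {"Options": "10", "Points": 4000},
    "Assignment": {"Options": "A", "Points": 0},
    "Leave": {"Options": "35", "Points": 1600},
}

points = [combination[issue]["Points"] for issue in issues]
labels = " + ".join(f"{issue} {combination[issue]['Options']}" for issue in issues)
terms = " + ".join(str(p) for p in points)
total = sum(points)  # plain arithmetic, no eval needed

print(f"{labels} = {terms} = {total}")
# Bonus 10 + Assignment A + Leave 35 = 4000 + 0 + 1600 = 5600
```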

All of this can be joined together like so:

>>> 'Combination_%d: %s'%(0,' = '.join([c1, c2, c3]))
'Combination_0: Bonus 10 + Assignment A + Leave 35 =  4000 +  0 +  1600 = 5600'

We can define a function that builds this string for every combination in the list:

>>> def get_output(i,combination, issues):                       
        c1 = ' + '.join([issue + ' %s'%combination[issue]['Options']
                         for issue in issues])
        c2 = ' + '.join([' %s'%combination[issue]['Points'] 
                         for issue in issues])
        c3 = str(eval(c2))
        return 'Combination_%d: %s'%(i,' = '.join([c1, c2, c3]))

>>> [get_output(i+1,c, issues) for i, c in enumerate(combinations)]

['Combination_1: Bonus 10 + Assignment A + Leave 35 =  4000 +  0 +  1600 = 5600',
 'Combination_2: Bonus 10 + Assignment A + Leave 30 =  4000 +  0 +  1200 = 5200',
 'Combination_3: Bonus 10 + Assignment A + Leave 25 =  4000 +  0 +  800 = 4800',
 'Combination_4: Bonus 10 + Assignment A + Leave 20 =  4000 +  0 +  400 = 4400',
 'Combination_5: Bonus 10 + Assignment A + Leave 15 =  4000 +  0 +  0 = 4000',
 'Combination_6: Bonus 10 + Assignment B + Leave 35 =  4000 +  -600 +  1600 = 5000',
 'Combination_7: Bonus 10 + Assignment B + Leave 30 =  4000 +  -600 +  1200 = 4600',
 'Combination_8: Bonus 10 + Assignment B + Leave 25 =  4000 +  -600 +  800 = 4200',
 'Combination_9: Bonus 10 + Assignment B + Leave 20 =  4000 +  -600 +  400 = 3800',
 'Combination_10: Bonus 10 + Assignment B + Leave 15 =  4000 +  -600 +  0 = 3400', ...]
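Since the question asks which combination is the strongest, the totals can also be kept as numbers and sorted. The following is a sketch rather than part of the original answer: it re-enters the option table from the question as plain tuples, and the names `options` and `scored` are only illustrative.

```python
from itertools import product

# Option table from the question, as issue -> list of (option, points).
options = {
    "Bonus": [("10", 4000), ("8", 3000), ("6", 2000), ("4", 1000), ("2", 0)],
    "Assignment": [("A", 0), ("B", -600), ("C", -1200), ("D", -1800), ("E", -2400)],
    "Leave": [("35", 1600), ("30", 1200), ("25", 800), ("20", 400), ("15", 0)],
}

scored = []
for picks in product(*options.values()):
    # picks holds one (option, points) pair per issue, in issue order.
    label = " + ".join(f"{issue} {opt}" for issue, (opt, _) in zip(options, picks))
    total = sum(pts for _, pts in picks)
    scored.append((total, label))

# Highest total first.
scored.sort(key=lambda item: item[0], reverse=True)

print(len(scored))  # 125
print(scored[0])    # (5600, 'Bonus 10 + Assignment A + Leave 35')
```

Sorting the whole list, rather than calling `max`, also makes it easy to inspect the runners-up.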