Applying a function groupwise in Python

Time: 2017-05-13 22:11:07

Tags: python pandas numpy

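How can I apply a function groupwise to a pandas DataFrame when the function should be applied to child groups, but the child group labels repeat across different parent groups?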

Example:


| Parent Group | Child Group | Value |
|--------------|-------------|-------|
| A            | I1          | V1    |
| A            | I1          | V2    |
| A            | I2          | V3    |
| A            | I2          | V4    |
| B            | I1          | V5    |
| B            | I1          | V6    |
| B            | I2          | V7    |
| B            | I2          | V8    |

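I can make the child groups unique by combining the parent group key with the child group key, e.g. ['A_I1', 'A_I2'], and then apply the function to get the expected output: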

| Parent Group | Child Group | Value     |
|--------------|-------------|-----------|
| A            | I1          | f(V1, V2) |
| A            | I2          | f(V3, V4) |
| B            | I1          | f(V5, V6) |
| B            | I2          | f(V7, V8) |

But I wonder whether there is a more elegant way?
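
The combined-key workaround described in the question might be sketched roughly as follows. The column name `Combined Key` and the string-joining stand-in for `f` are assumptions made for illustration; the question does not specify either.

```python
import pandas as pd

# Sample frame rebuilt from the question's example table
df = pd.DataFrame({
    'Parent Group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Child Group': ['I1', 'I1', 'I2', 'I2', 'I1', 'I1', 'I2', 'I2'],
    'Value': ['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8'],
})

def f(values):
    # Hypothetical stand-in for the asker's function: joins the group's values
    return 'f(' + ', '.join(values) + ')'

# Build a key that is unique per (parent, child) pair, e.g. 'A_I1'
# ('Combined Key' is an illustrative column name, not from the question)
df['Combined Key'] = df['Parent Group'] + '_' + df['Child Group']

# Group on the single combined key and apply f to each group's values
workaround = df.groupby('Combined Key')['Value'].apply(f)
print(workaround)
```

This works, but it adds a helper column and loses the separate parent and child labels in the result, which is presumably why a cleaner alternative is being asked for.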

2 answers:

Answer 0: (score: 0)

You can do it like this:

df.groupby(['Parent Group', 'Child Group'])['Value'].apply(lambda x: ', '.join(x))

Output:

Parent Group  Child Group
A             I1             V1, V2
              I2             V3, V4
B             I1             V5, V6
              I2             V7, V8

If you want to change the output values with string formatting, you can do this:

df.groupby(['Parent Group', 'Child Group'])['Value'].apply(lambda x: "f(%s)" % ', '.join(x))

Output:

Parent Group  Child Group
A             I1             f(V1, V2)
              I2             f(V3, V4)
B             I1             f(V5, V6)
              I2             f(V7, V8)
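
For readers who want to run this directly, here is a self-contained sketch of the same approach, with the sample frame rebuilt from the question's table. The final `reset_index()` call is an addition that is not part of the answer; it is one way to get back the flat three-column layout shown in the question's expected output.

```python
import pandas as pd

# Sample frame rebuilt from the question's example table
df = pd.DataFrame({
    'Parent Group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Child Group': ['I1', 'I1', 'I2', 'I2', 'I1', 'I1', 'I2', 'I2'],
    'Value': ['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8'],
})

# Grouping by both columns keeps A/I1 and B/I1 apart,
# so no combined key column is needed
result = (
    df.groupby(['Parent Group', 'Child Group'])['Value']
      .apply(lambda x: "f(%s)" % ', '.join(x))
      .reset_index()  # turn the two index levels back into ordinary columns
)
print(result)
```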

Answer 1: (score: 0)

Assumption: each group has exactly 2 rows.

Setup

import pandas as pd

df = pd.DataFrame({
    'Child Group': ['I1', 'I1', 'I2', 'I2', 'I1', 'I1', 'I2', 'I2'],
    'Parent Group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Value': ['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8'],
})

Out[1305]: 
  Child Group Parent Group Value
0          I1            A    V1
1          I1            A    V2
2          I2            A    V3
3          I2            A    V4
4          I1            B    V5
5          I1            B    V6
6          I2            B    V7
7          I2            B    V8

Demo

def func(x,y):
    return x+y

# Group by Parent Group and Child Group. Within each group, the first value
# can be referenced by x.iloc[0]['Value'] and the last by x.iloc[-1]['Value'].
# The example below calls func to concatenate the two values.
df.groupby(['Parent Group','Child Group']).apply(lambda x: func(x.iloc[0]['Value'],x.iloc[-1]['Value']))
Out[1304]: 
Parent Group  Child Group
A             I1             V1V2
              I2             V3V4
B             I1             V5V6
              I2             V7V8
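
If the two-rows-per-group assumption does not hold, one possible variation is to fold the two-argument `func` over every value in the group with `functools.reduce`. This is a sketch rather than part of the answer above, and the extra row `V9` is hypothetical data added only to give one group a third row.

```python
from functools import reduce

import pandas as pd

df = pd.DataFrame({
    'Child Group': ['I1', 'I1', 'I2', 'I2', 'I1', 'I1', 'I2', 'I2', 'I1'],
    'Parent Group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'A'],
    # 'V9' is a hypothetical extra row so that group (A, I1) has three rows
    'Value': ['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9'],
})

def func(x, y):
    return x + y

# reduce applies func pairwise across all values in the group,
# so the group size no longer has to be exactly two
result = (
    df.groupby(['Parent Group', 'Child Group'])['Value']
      .apply(lambda s: reduce(func, s))
)
print(result)
```

Selecting the `'Value'` column before calling `apply` also means the lambda receives a Series of values rather than the whole sub-frame, which keeps the function body simpler.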