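How do I apply a function group-wise to a pandas DataFrame when the function operates on child groups, but the child group labels repeat across different parent groups?
Example: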
Input data:
| Parent Group | Child Group | Value |
|--------------|-------------|-------|
| A            | I1          | V1    |
| A            | I1          | V2    |
| A            | I2          | V3    |
| A            | I2          | V4    |
| B            | I1          | V5    |
| B            | I1          | V6    |
| B            | I2          | V7    |
| B            | I2          | V8    |
I can make the child groups unique by combining the parent group key with the child group key, e.g. ['A_I1', 'A_I2'], and then apply the function to get the expected output:
| Parent Group | Child Group | Value     |
|--------------|-------------|-----------|
| A            | I1          | f(V1, V2) |
| A            | I2          | f(V3, V4) |
| B            | I1          | f(V5, V6) |
| B            | I2          | f(V7, V8) |
But I wonder whether there is a more elegant way?
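For reference, the combined-key workaround described above could look roughly like the following sketch. The function `f` is a hypothetical stand-in (the question does not say what the real function does), and the `Combined Key` name is an assumption made for illustration.

```python
import pandas as pd

# Sample data matching the input table in the question.
df = pd.DataFrame({
    "Parent Group": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Child Group": ["I1", "I1", "I2", "I2", "I1", "I1", "I2", "I2"],
    "Value": ["V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8"],
})


def f(values):
    # Hypothetical placeholder: formats the group's values as "f(V1, V2)".
    return "f(" + ", ".join(values) + ")"


# Build a composite key such as "A_I1" so that child groups are unique.
combined_key = (df["Parent Group"] + "_" + df["Child Group"]).rename("Combined Key")

# Group on the composite key and apply f to each group's values.
workaround = df.groupby(combined_key)["Value"].agg(f)
print(workaround)
```

This works, but it loses the separate parent and child columns unless the key is split again afterwards.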
Answer 0 (score: 0)
You can group by both columns at once:
df.groupby(['Parent Group', 'Child Group'])['Value'].apply(lambda x: ', '.join(x))
Output:
Parent Group  Child Group
A             I1             V1, V2
              I2             V3, V4
B             I1             V5, V6
              I2             V7, V8
If you want to change the output values with string formatting, you can do this:
df.groupby(['Parent Group', 'Child Group'])['Value'].apply(lambda x: "f(%s)" % ', '.join(x))
Output:
Parent Group  Child Group
A             I1             f(V1, V2)
              I2             f(V3, V4)
B             I1             f(V5, V6)
              I2             f(V7, V8)
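If a flat table like the one in the question is preferred over a Series with a MultiIndex, one option is to aggregate and then call `reset_index()`. This is a minimal sketch, assuming the sample data from the question; `f` is a hypothetical placeholder for whatever function is actually needed.

```python
import pandas as pd

# Sample data matching the input table in the question.
df = pd.DataFrame({
    "Parent Group": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Child Group": ["I1", "I1", "I2", "I2", "I1", "I1", "I2", "I2"],
    "Value": ["V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8"],
})


def f(values):
    # Hypothetical placeholder: receives all values of one (parent, child) group.
    return "f(" + ", ".join(values) + ")"


# Grouping on both columns keeps A/I1 and B/I1 apart without a combined key.
result = (
    df.groupby(["Parent Group", "Child Group"])["Value"]
    .agg(f)
    .reset_index()  # turn the two index levels back into ordinary columns
)
print(result)
```

Because the grouping uses the pair of keys, repeated child labels under different parents never end up in the same group.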
Answer 1 (score: 0)
Assumption: each (parent, child) group has exactly 2 rows.
Setup
import pandas as pd

df = pd.DataFrame({
    'Child Group': ['I1', 'I1', 'I2', 'I2', 'I1', 'I1', 'I2', 'I2'],
    'Parent Group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Value': ['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8'],
})
df
Out[1305]:
  Child Group Parent Group Value
0          I1            A    V1
1          I1            A    V2
2          I2            A    V3
3          I2            A    V4
4          I1            B    V5
5          I1            B    V6
6          I2            B    V7
7          I2            B    V8
Demo
def func(x, y):
    return x + y

# Group by Parent Group and Child Group and select the Value column.
# Within each group, the first value can be referenced by s.iloc[0]
# and the second (last) value by s.iloc[-1].
# Below is an example that calls a function to concatenate the two values.
df.groupby(['Parent Group', 'Child Group'])['Value'].apply(lambda s: func(s.iloc[0], s.iloc[-1]))
Out[1304]:
Parent Group  Child Group
A             I1             V1V2
              I2             V3V4
B             I1             V5V6
              I2             V7V8
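If the two-rows-per-group assumption does not hold, a function that receives all values of a group may be a safer choice than indexing the first and last row. The sketch below uses hypothetical data with uneven group sizes, and `concat_all` is a hypothetical helper, not part of the answer above.

```python
import pandas as pd

# Hypothetical data: group (A, I1) has three rows, the other groups have one.
df = pd.DataFrame({
    "Parent Group": ["A", "A", "A", "B", "B"],
    "Child Group": ["I1", "I1", "I1", "I1", "I2"],
    "Value": ["V1", "V2", "V3", "V4", "V5"],
})


def concat_all(values):
    # Concatenate every value in the group, however many rows it has.
    return "".join(values)


out = df.groupby(["Parent Group", "Child Group"])["Value"].agg(concat_all)
print(out)
```

With `iloc[0]` and `iloc[-1]`, a three-row group would silently skip the middle value and a one-row group would use the same value twice, so passing the whole group to the function avoids both cases.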