I have a DataFrame like this in Pandas:
df = pd.DataFrame({
    'org': ['A1', 'B1', 'A1', 'B2'],
    'DIH': [True, False, True, False],
    'Quantity': [10, 20, 10, 20],
    'Items': [1, 2, 3, 4]
})
Now I want to get the value counts and the modal value of Quantity, but weighted by the number of Items.
I know I can do
df.groupby('Quantity').agg({'Items': 'sum'}).sort_values('Items', ascending=False)
to get this:
Quantity  Items
20        6
10        4
But how do I get it as percentages, like this?
Quantity  Items
20        60
10        40
Answer 0 (score: 2)
This worked for me:
df.groupby('Quantity').agg({'Items': 'sum'}).sort_values('Items', ascending=False)/df['Items'].sum()*100
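The same idea can be written against a Series rather than a one-column DataFrame. This is a minimal sketch, assuming the DataFrame from the question; the names `weighted` and `percent` are illustrative and not from the original post. Multiplying by 100 before dividing keeps the arithmetic exact for this integer data.

```python
import pandas as pd

df = pd.DataFrame({
    'org': ['A1', 'B1', 'A1', 'B2'],
    'DIH': [True, False, True, False],
    'Quantity': [10, 20, 10, 20],
    'Items': [1, 2, 3, 4],
})

# Sum the weights (Items) for each distinct Quantity value.
weighted = df.groupby('Quantity')['Items'].sum()

# Express each total as a percentage of the overall weight, largest first.
percent = (weighted * 100 / weighted.sum()).sort_values(ascending=False)
print(percent)
```

With this data, Quantity 20 carries 60 percent of the weight and Quantity 10 carries 40 percent.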
Answer 1 (score: 1)
In case it is of interest, here is a function that takes a DataFrame as input and returns the weighted value counts (normalized or not).
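def weighted_value_counts(x, *args, **kwargs):
    # x: DataFrame whose first column holds the values and second column the weights
    normalize = kwargs.get('normalize', False)
    c0 = x.columns[0]
    c1 = x.columns[1]
    # Sum the weights per distinct value, largest total first
    xtmp = x[[c0, c1]].groupby(c0).agg({c1: 'sum'}).sort_values(c1, ascending=False)
    s = pd.Series(index=xtmp.index, data=xtmp[c1], name=c0)
    if normalize:
        # Divide by the total weight so the result sums to 1
        s = s / x[c1].sum()
    return s
Using the example from the question, the weights are in the Items column.
You can get the weighted, normalized value counts by passing the Quantity and Items columns with normalize=True, for example:
weighted_value_counts(df[['Quantity', 'Items']], normalize=True)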
Answer 2 (score: 0)
Just add one more line to your code:
df2 = df.groupby('Quantity').agg({'Items': 'sum'}).sort_values('Items', ascending=False)
df2['Items']=(df2['Items']*100)/df2['Items'].sum()
print(df2)
Output:
          Items
Quantity
20         60.0
10         40.0
Answer 3 (score: 0)
Try this (a one-liner):
df.groupby('Quantity').agg({'Items': 'sum'}).sort_values('Items', ascending=False).apply(lambda x: 100*x/float(x.sum()))
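The question also asks for the modal value, which the one-liners above do not return directly. This is a minimal sketch, assuming "weighted mode" means the Quantity whose summed Items is largest; `weighted` and `weighted_mode` are illustrative names, not from the original thread.

```python
import pandas as pd

df = pd.DataFrame({
    'org': ['A1', 'B1', 'A1', 'B2'],
    'DIH': [True, False, True, False],
    'Quantity': [10, 20, 10, 20],
    'Items': [1, 2, 3, 4],
})

# Total weight (Items) per distinct Quantity value.
weighted = df.groupby('Quantity')['Items'].sum()

# idxmax returns the index label (here, the Quantity) with the largest total.
weighted_mode = weighted.idxmax()
print(weighted_mode)  # 20
```

If two values tie for the largest total, `idxmax` returns the first one in index order, so handle ties explicitly if they matter.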