I have a df like this:
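user = pd.DataFrame({
    'User': ['101', '101', '101', '102', '102', '101', '101', '102', '102', '102'],
    'Country': ['India', 'Japan', 'India', 'Brazil', 'Japan', 'UK', 'Austria', 'Japan', 'Singapore', 'UK'],
    'Count': [50, 1, 2, 5, 6, 89, 10.9, 10, 5, 6],
})
I am doing a calculation on it for one user at a time.
I have many users like this. How can I put these calculations in a function and loop over all users?
I am looking for an improvement in the code: at the moment I first split out a DataFrame for each user and then do the calculation. Can I just pass the user ID and get the required output DataFrame, user_v2?
Thanks.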
Answer 0: (score: 4)
You can do the operation for all users at once by first performing a groupby on User. Then, instead of using a function to assign the group row by row, use np.select.
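import pandas as pd
import numpy as np
# Percentile rank of Count within each user's rows, scaled to 0-100
user['Percentile'] = user.groupby('User')['Count'].rank(pct=True, ascending=True) * 100
# np.select uses the first condition that matches: <33 -> 1, <66 -> 2, otherwise 3
user['group'] = np.select([user.Percentile < 33, user.Percentile < 66, user.Percentile >= 66], [1, 2, 3])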
user now has the added Percentile and group columns.
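For anyone who wants to try the groupby plus np.select approach end to end, here is a minimal sketch that rebuilds the sample DataFrame from the question and applies both steps. The cutoffs 33 and 66 are the ones used in this answer; the final argument to np.select is written as a default here, which is an equivalent way to express the "otherwise 3" case, not something the answer itself does.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
user = pd.DataFrame({
    'User': ['101', '101', '101', '102', '102', '101', '101', '102', '102', '102'],
    'Country': ['India', 'Japan', 'India', 'Brazil', 'Japan', 'UK', 'Austria',
                'Japan', 'Singapore', 'UK'],
    'Count': [50, 1, 2, 5, 6, 89, 10.9, 10, 5, 6],
})

# Rank each Count among the rows of the same user; ties share the average rank
user['Percentile'] = user.groupby('User')['Count'].rank(pct=True) * 100

# Conditions are checked in order, so the second one only sees rows >= 33;
# anything matching neither condition falls through to the default of 3
conditions = [user['Percentile'] < 33, user['Percentile'] < 66]
user['group'] = np.select(conditions, [1, 2], default=3)

print(user)
```

No per-user loop or per-user DataFrame is needed: groupby keeps the ranking separate for each user, and np.select works on the whole column at once.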
Answer 1: (score: 1)
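You can use a groupby on User together with the rank function to get the percentiles. Because you assign the groups 1, 2, 3 by dividing the percentile range into equal thirds, you can also multiply the percentile by 3, divide by 100, and round up with math.ceil or numpy.ceil: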
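# Percentile rank of Count within each user, scaled to 0-100
user['Percentile'] = user.groupby('User')['Count'].rank(pct=True) * 100
# Map (0, 100] onto thirds and round up to get group 1, 2 or 3
user['group'] = np.ceil(user.Percentile * 3 / 100).astype(int)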
This produces the output:
  User    Country  Count  Percentile  group
0  101      India   50.0        80.0      3
1  101      Japan    1.0        20.0      1
2  101      India    2.0        40.0      2
3  102     Brazil    5.0        30.0      1
4  102      Japan    6.0        70.0      3
5  101         UK   89.0       100.0      3
6  101    Austria   10.9        60.0      2
7  102      Japan   10.0       100.0      3
8  102  Singapore    5.0        30.0      1
9  102         UK    6.0        70.0      3
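Since the question asks for a function, the ceiling approach could be wrapped up roughly like this. This is only a sketch: the name add_percentile_groups and its parameters (user_col, value_col, n_groups) are made up for illustration, and the small rounding step is an added precaution against floating-point noise rather than part of the answer.

```python
import numpy as np
import pandas as pd


def add_percentile_groups(df, user_col='User', value_col='Count', n_groups=3):
    """Return a copy of df with per-user Percentile and group columns.

    Hypothetical helper; the name and parameters are illustrative only.
    """
    # Fractional rank in (0, 1] of each value among the rows of the same user
    pct = df.groupby(user_col)[value_col].rank(pct=True)
    # Split (0, 1] into n_groups equal bins; rounding up gives labels 1..n_groups.
    # round(9) guards against values such as 2.0000000000000004 being pushed up a bin.
    group = np.ceil((pct * n_groups).round(9)).astype(int)
    return df.assign(Percentile=pct * 100, group=group)


# Sample data copied from the question
user = pd.DataFrame({
    'User': ['101', '101', '101', '102', '102', '101', '101', '102', '102', '102'],
    'Country': ['India', 'Japan', 'India', 'Brazil', 'Japan', 'UK', 'Austria',
                'Japan', 'Singapore', 'UK'],
    'Count': [50, 1, 2, 5, 6, 89, 10.9, 10, 5, 6],
})

user_v2 = add_percentile_groups(user)
print(user_v2)
```

The function handles every user in one call, so there is nothing to loop over. Note that equal thirds put the cutoffs at about 33.3 and 66.7, whereas the np.select version cuts at 33 and 66, so a percentile of exactly 33 or 66 would land in a different group under the two approaches. On the sample data both give the same groups.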