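I have a df with two columns, x and y. Column y is a running count of the x values, and different x values have different counts. How can I get the top two y counts for each x without iterating over the rows?
Example df: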
import pandas as pd
df = pd.DataFrame({"x": [101, 101, 101, 101, 201, 201, 201, 405, 405], "y": [1, 2, 3, 4, 1, 2, 3, 1, 2]})
x y
0 101 1
1 101 2
2 101 3
3 101 4
4 201 1
5 201 2
6 201 3
7 405 1
8 405 2
Expected result:
x y
101 3
101 4
201 2
201 3
405 1
405 2
Answer 0 (score: 1)
You can do this:
In [35]:
df.loc[df.groupby(['x'])['y'].apply(lambda x: x.iloc[-2:]).index.get_level_values(1)]
Out[35]:
x y
2 101 3
3 101 4
5 201 2
6 201 3
7 405 1
8 405 2
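This groups on the 'x' column and returns the last 2 values of each group, assuming the df is already sorted as you have shown. The groupby result has a MultiIndex (the group key plus the original row label), and the original row labels can be extracted from it with get_level_values.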
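To make the intermediate steps visible, here is the same one-liner split into pieces. This is only a sketch: the names last_two, row_labels and result are mine, and the final tail(2) line is an alternative I am adding for comparison, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"x": [101, 101, 101, 101, 201, 201, 201, 405, 405],
                   "y": [1, 2, 3, 4, 1, 2, 3, 1, 2]})

# Last two y values of each x group. The result is a Series whose
# MultiIndex is (x, original row label).
last_two = df.groupby("x")["y"].apply(lambda s: s.iloc[-2:])

# Level 1 of that MultiIndex holds the original row labels.
row_labels = last_two.index.get_level_values(1)

# Select the matching rows from the original frame.
result = df.loc[row_labels]
print(result)

# Alternative (my addition): GroupBy.tail keeps the last n rows per group
# and preserves the original row labels.
alt = df.groupby("x").tail(2)
print(alt)
```

Both variants rely on the rows already being ordered by y within each x.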
EDIT
To answer your comment, you can use groupby again, together with transform and rank, to reset the values to 1 and 2.
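A minimal sketch of what that step could look like, not necessarily the exact code the answer had in mind. The variable name top_two, the choice of rank(method="first") and the cast to int are my assumptions.

```python
import pandas as pd

df = pd.DataFrame({"x": [101, 101, 101, 101, 201, 201, 201, 405, 405],
                   "y": [1, 2, 3, 4, 1, 2, 3, 1, 2]})

# Start from the last two rows per group, as in the snippet above.
labels = df.groupby("x")["y"].apply(lambda s: s.iloc[-2:]).index.get_level_values(1)
top_two = df.loc[labels].copy()

# Rank y inside each x group so the kept values become 1 and 2.
# method="first" (an assumption) breaks ties by order of appearance.
top_two["y"] = top_two.groupby("x")["y"].transform(
    lambda s: s.rank(method="first").astype(int)
)
print(top_two)
```

The row labels of the original df are kept, only the y column is overwritten.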
Answer 1 (score: 0)
If your dataframe is not sorted, here is a solution:
In [1]: df.groupby('x')['y'].nlargest(2)
Out[1]:
x
101  3    4
     2    3
201  6    3
     5    2
405  8    2
     7    1
dtype: int64
Unfortunately, nlargest cannot be applied to a grouped DataFrame, so some reformatting is needed.
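One possible way to do that reformatting, as a sketch under my own assumptions: the shuffle only simulates unsorted input, and the final sort is there to match the ascending order shown in the question's expected result.

```python
import pandas as pd

df = pd.DataFrame({"x": [101, 101, 101, 101, 201, 201, 201, 405, 405],
                   "y": [1, 2, 3, 4, 1, 2, 3, 1, 2]})

# Shuffle the rows to simulate an unsorted frame (illustration only).
shuffled = df.sample(frac=1, random_state=0)

# Two largest y values per x; a Series indexed by (x, original row label).
top = shuffled.groupby("x")["y"].nlargest(2)

# Drop the row-label level, turn x back into a column, and sort so each
# group lists its values in ascending order.
result = (
    top.reset_index(level=1, drop=True)
       .reset_index()
       .sort_values(["x", "y"])
       .reset_index(drop=True)
)
print(result)
```

Note that this returns a fresh 0..n-1 index; if you need the original row labels, keep level 1 instead of dropping it.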