Question

I have a table in a pandas DataFrame (df):

id_x  id_y
a      b
b      c
a      c
d      a
x      a
m      b
c      z
a      k
b      q
d      w
a      w
q      v

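How to read this table:

For a, the combinations are a-b, a-c, a-k and a-w; similarly for b (b-c, b-q), and so on. I want to write a function, def test_func(id), that takes an id_x value from df and checks whether the number of occurrences of that id is at least 3. The count can be obtained with df['id_x'].value_counts().

For example: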

def test_func(id):
    # number of rows whose id_x equals the given id
    id_count = (df['id_x'] == id).sum()
    if id_count >= 3:
        print('yes')
        ddf = df[df['id_x'] == id]
        ddf.to_csv(id + ".csv")
    else:
        print('no')
        # id_count < 3: do something (I've explained below what I have to do in this case)

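Suppose b occurs only 2 times (b-c and b-q), which is less than 3.

In that case, check whether its id_y values (c and q here) have any combinations of their own.

c has 1 combination (c-z), and similarly q has 1 combination (q-v).

So b should be linked with z and v: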

id_x   id_y
b       c
b       q
b       z
b       v

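And store it in ddf2, the same way the result is stored for the > 10 case.

For a particular id, it would help if I could save the CSV under the id's name. I hope I have explained my problem correctly. I am new to Python and don't know how to write functions; this is just my logic.

Can anyone help me with the implementation? Thanks in advance.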

Answer 1

Edit: solution redesigned based on the comments.

public override async Task GrantResourceOwnerCredentials(OAuthGrantResourceOwnerCredentialsContext context)
{
    var userManager = context.OwinContext.GetUserManager<ApplicationUserManager>();

    ApplicationUser user = await userManager.FindAsync(context.UserName, context.Password);

    if (user == null)
    {
        context.SetError("invalid_grant", "The user name or password is incorrect.");
        return;
    }

    // Note: the message below is insecure, since it reveals to the outside world that the email exists.
    // This is where token generation stops if the user's email is not confirmed.
    // If a specific role does not require this validation,
    // check whether the user is in that role before checking EmailConfirmed.
    if (!user.EmailConfirmed)
    {
        context.SetError("invalid_grant", "Email is not confirmed");
        return;
    }
    //... rest of code
}

Answer 2

First, filter the DataFrame by group length (for testing, < 3):

a = df.groupby('id_x').filter(lambda x: len(x) < 3)
print (a)
   id_x id_y
1     b    c
3     d    a
4     x    a
5     m    b
6     c    z
8     b    q
9     d    w
11    q    v
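The same selection can probably also be written without a Python-level lambda, using the value_counts idea mentioned in the question. This is only a minimal sketch: the DataFrame is rebuilt from the question's table, and the names counts and small are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({
    'id_x': list('abadxmcabdaq'),
    'id_y': list('bccaabzkqwwv'),
})

# Number of occurrences of each id_x value
counts = df['id_x'].value_counts()

# Map every row to the count of its id_x and keep rows below the threshold
small = df[df['id_x'].map(counts) < 3]
print(small)
```

The original index labels are preserved, so the later query steps should work on small just as they do on a.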

Then filter the rows for b and rename the column:

a1 = a.query("id_x == 'b'").rename(columns={'id_y':'id'})
print (a1)
  id_x id
1    b  c
8    b  q

Also filter the rows where id_y is not b, and rename the column:

a2 = a.query("id_y != 'b'").rename(columns={'id_x':'id'})
print (a2)
   id id_y
1   b    c
3   d    a
4   x    a
6   c    z
8   b    q
9   d    w
11  q    v

Then merge on the id column:

b = pd.merge(a1,a2, on='id').drop('id', axis=1)
print (b)
  id_x id_y
0    b    z
1    b    v

Last, concat the rows filtered by b with the new DataFrame:

c = pd.concat([a.query("id_x == 'b'"), b])
print (c)
  id_x id_y
1    b    c
8    b    q
0    b    z
1    b    v
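The steps above could be wrapped into one function along the lines the question asks for. This is a hedged sketch, not the answerer's code: the name linked_rows and the threshold parameter are hypothetical, it looks for second-hop rows in the whole DataFrame rather than only in the small groups, and it follows a single hop only.

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({
    'id_x': list('abadxmcabdaq'),
    'id_y': list('bccaabzkqwwv'),
})

def linked_rows(df, id_, threshold=3):
    """Rows for id_, extended by one hop when it has fewer than `threshold` rows.

    Hypothetical helper: name and signature are assumptions for illustration.
    """
    own = df[df['id_x'] == id_]
    if len(own) >= threshold:
        return own
    # Rows whose id_x is one of the id_y values of id_ (c-z and q-v for b)
    hop = df[df['id_x'].isin(own['id_y'])]
    # Re-label those rows so that they are linked to id_ directly
    extra = hop[['id_y']].assign(id_x=id_)[['id_x', 'id_y']]
    return pd.concat([own, extra], ignore_index=True)

result = linked_rows(df, 'b')
print(result)
# Optional: save under the id's name, as requested in the question
# result.to_csv('b.csv', index=False)
```

For b this gives the four rows b-c, b-q, b-z and b-v. An id whose id_y values point to large groups (for example d, which links to a) would pick up all of those rows, so an extra filter may be needed depending on the intended behaviour.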


Complex function to get the combinations of one column with another column
