Question

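I've just started using Python and pandas. I searched Google and Stack Overflow for an answer to my question but haven't been able to find one. Here is what I need to do: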

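I have a df with several rows of data per person (ID) and a variable called response_go that is coded as either 1 or 0 (dtype int64), like this (only much bigger, with 480 rows per person...):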

   ID  response_go
0   1            1
1   1            0
2   1            0
3   1            1
4   2            1
5   2            0
6   2            1
7   2            1

Now I want to check, for each ID/person separately, whether the entries in response_go are all coded 0, all coded 1, or neither (the else case). So far I have come up with this:

    ids = df['ID'].unique()

    for id in ids:   
        if (df.response_go.all() == 1): 
            print "ID:",id,": 100% Go"
        elif (df.response_go.all() == 0):
            print "ID:",id,": 100% NoGo"
    else:
        print "ID:",id,": Mixed Response Pattern"

However, it gives me the following output:

ID: 1 : 100% NoGo
ID: 2 : 100% NoGo
ID: 2 : Mixed Response Pattern

when it should be (since both IDs contain 1s and 0s):

ID: 1 : Mixed Response Pattern
ID: 2 : Mixed Response Pattern

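I'm sorry if this has been asked before, but I really couldn't find a solution while searching. If it has already been answered, please point me to it. Thanks, everyone! I really appreciate it.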

Answer 1

Example (with different data):

import numpy as np
import pandas as pd

df = pd.DataFrame({'ID' : [1] * 3 + [2] * 3 + [3] * 3, 
                   'response_go' : [0, 0, 0, 1, 1, 1, 0, 1, 0]})
df

   ID  response_go
0   1            0
1   1            0
2   1            0
3   2            1
4   2            1
5   2            1
6   3            0
7   3            1
8   3            0

Use groupby + mean:

v = df.groupby('ID').response_go.mean()
v

ID
1    0.000000
2    1.000000
3    0.333333
Name: response_go, dtype: float64
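
The mean works as a test here because response_go only holds 0 and 1: a group mean of exactly 1 means every row is 1, and a mean of exactly 0 means every row is 0. Below is a minimal sketch that cross-checks this against an explicit per-group min/max, assuming the same example frame as above (the names `stats`, `all_go` and `all_nogo` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'ID': [1] * 3 + [2] * 3 + [3] * 3,
                   'response_go': [0, 0, 0, 1, 1, 1, 0, 1, 0]})

# Per-ID mean, min and max of the 0/1 column
stats = df.groupby('ID')['response_go'].agg(['mean', 'min', 'max'])

# All rows are 1 exactly when the smallest value is 1;
# all rows are 0 exactly when the largest value is 0
all_go = stats['min'] == 1
all_nogo = stats['max'] == 0

# The mean-based test agrees with the min/max test
assert (all_go == (stats['mean'] == 1)).all()
assert (all_nogo == (stats['mean'] == 0)).all()
```

If the column could ever contain values other than 0 and 1, the min/max form would probably be the safer of the two.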

Use np.select to compute your status from the mean of response_go:

u = np.select([v == 1, v == 0], ['100% Go', '100% NoGo'], default='Mixed Response Pattern')

Alternatively, use a nested np.where to do the same thing (slightly faster):

u = np.where(v == 1, '100% Go', np.where(v == 0, '100% NoGo', 'Mixed Response Pattern'))
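
As a quick sanity check, the two variants can be compared directly. This is a small self-contained sketch, assuming per-ID means like the ones in the example above; `means`, `labels_select` and `labels_where` are illustrative names, not part of the original code:

```python
import numpy as np

# Hypothetical per-ID means, matching the example data (IDs 1, 2, 3)
means = np.array([0.0, 1.0, 1 / 3])

# np.select: the first matching condition wins, everything else gets the default
labels_select = np.select([means == 1, means == 0],
                          ['100% Go', '100% NoGo'],
                          default='Mixed Response Pattern')

# Nested np.where: the same decision written as if / elif / else
labels_where = np.where(means == 1, '100% Go',
                        np.where(means == 0, '100% NoGo',
                                 'Mixed Response Pattern'))

# Both approaches produce the same array of labels
assert (labels_select == labels_where).all()
```

With only three outcomes either form reads fine; np.select tends to stay more readable as the number of conditions grows.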

Now assign the result back:

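# Build a new Series of labels on the same index, rather than writing strings into the float Series in place
v = pd.Series(u, index=v.index, name=v.name)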
v

ID
1                 100% NoGo
2                   100% Go
3    Mixed Response Pattern
Name: response_go, dtype: object
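
For comparison, here is a sketch of how the loop from the question could be repaired. In the original, `df.response_go.all()` looks at the whole column on every iteration instead of only the rows for the current ID, and the `else` is aligned with the `for`, so it runs once after the loop finishes. The version below assumes Python 3 and uses the question's sample data; `classify` is a hypothetical helper name:

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 1, 1, 2, 2, 2, 2],
                   'response_go': [1, 0, 0, 1, 1, 0, 1, 1]})

def classify(responses):
    """Label one person's 0/1 responses."""
    if (responses == 1).all():
        return '100% Go'
    if (responses == 0).all():
        return '100% NoGo'
    return 'Mixed Response Pattern'

# groupby hands classify() only the rows belonging to one ID at a time
for person_id, responses in df.groupby('ID')['response_go']:
    print(f"ID: {person_id} : {classify(responses)}")

# Prints:
# ID: 1 : Mixed Response Pattern
# ID: 2 : Mixed Response Pattern
```

This is easier to follow than the vectorized version but will likely be slower on large frames, since it runs Python code once per ID.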
