Question

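I am trying to add a column, "flag_column", based on the values in columns A, B, C, and D.

That is, whenever one of A/B/C/D holds a value, I want to create a new column, "flag", containing the name of the column that holds it.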

  A B C D counts flag
0 1 0 0 0  1     A
1 0 1 0 0  1     B
2 1 0 0 0  1     A
3 0 0 1 0  1     C
4 0 1 0 0  1     B

Note: exactly one of the columns A through D contains a value in each row, so counts is always 1.

I tried:

if [df['A'] == 1] == True:
    df['flag'] = 'A'
elif [df['B'] == 1] == True:
    df['flag'] = 'B'
elif [df['C'] == 1] == True:
    df['flag'] = 'C'  
else:
    df['flag'] = 'D'

I also tried:

df['flag'] = np.where(df['A'] == 1, 'A', False)
df['flag'] = np.where(df['B'] == 1, 'B', False)
df['flag'] = np.where(df['C'] == 1, 'C', False)
df['flag'] = np.where(df['D'] == 1, 'D', False)

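I also tried looping over each "category" and assigning a flag value, but that overwrites the earlier values too.

Ideally there would be a way to do this iteratively. Still, any help with this (simple) problem would be much appreciated!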

Answer 1

We can use idxmax with axis=1:

df['flag'] = df.loc[:, 'A':'D'].idxmax(axis=1)

   A  B  C  D flag
0  1  0  0  0    A
1  0  1  0  0    B
2  1  0  0  0    A
3  0  0  1  0    C
4  0  1  0  0    B
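
One caveat worth checking: idxmax returns the first column that holds the row maximum, so a row with no value at all would still be labelled "A". The sketch below is a self-contained version of the line above, assuming the sample data from the question plus one hypothetical all-zero row (index 5) that is not in the original; the masking step is an addition, not part of the answer.

```python
import pandas as pd

# Sample data from the question, plus a hypothetical all-zero row (index 5).
df = pd.DataFrame({
    "A": [1, 0, 1, 0, 0, 0],
    "B": [0, 1, 0, 0, 1, 0],
    "C": [0, 0, 0, 1, 0, 0],
    "D": [0, 0, 0, 0, 0, 0],
})

cols = ["A", "B", "C", "D"]

# Name of the first column holding the row maximum.
flag = df[cols].idxmax(axis=1)

# Assumption: rows without any 1 should stay unflagged, so mask them to NaN.
has_value = df[cols].eq(1).any(axis=1)
df["flag"] = flag.where(has_value)

print(df)
```

If every row is guaranteed to contain exactly one value, as the question states, the masking step is unnecessary.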

Answer 2

Try using dot:

df['flag'] = df.loc[:,'A':'D'].dot(df.columns[:4])
Out[108]: 
0    A
1    B
2    A
3    C
4    B
dtype: object

Answer 3

Use np.select for multiple conditions:

df['flag'] = np.select([df['A'] == 1, df['B'] == 1, df['C'] == 1, df['D'] == 1],
                       ['A','B','C','D'],
                       False)
df

Out[1]:
    A   B   C   D   counts  flag
0   1   0   0   0   1       A
1   0   1   0   0   1       B
2   1   0   0   0   1       A
3   0   0   1   0   1       C
4   0   1   0   0   1       B

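As for np.where, here is where you went wrong: each statement rebuilt the entire column, so only the result of the last one survived. Write False only in the first statement, then pass the existing df['flag'] as the fallback in every later np.where: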

df['flag'] = np.where(df['A'] == 1, 'A', False)
df['flag'] = np.where(df['B'] == 1, 'B', df['flag'])
df['flag'] = np.where(df['C'] == 1, 'C', df['flag'])
df['flag'] = np.where(df['D'] == 1, 'D', df['flag'])

Out[2]:
    A   B   C   D   counts  flag
0   1   0   0   0   1       A
1   0   1   0   0   1       B
2   1   0   0   0   1       A
3   0   0   1   0   1       C
4   0   1   0   0   1       B

As you can see, np.select is more concise.
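
Since the question asks for an iterative approach, the chained np.where statements can also be folded into a loop. This is a minimal sketch, assuming the sample data from the question; the "?" placeholder is an assumption chosen here so the column stays all strings (a boolean default mixed with string choices can, depending on the NumPy version, end up as the string 'False').

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": [1, 0, 1, 0, 0],
    "B": [0, 1, 0, 0, 1],
    "C": [0, 0, 0, 1, 0],
    "D": [0, 0, 0, 0, 0],
})

cols = ["A", "B", "C", "D"]

# Start from a placeholder, then let each column fill in its own rows.
# Passing the running result as the fallback keeps earlier matches intact.
flag = np.full(len(df), "?", dtype=object)
for col in cols:
    flag = np.where(df[col] == 1, col, flag)

df["flag"] = flag
print(df)
```

Note that if a row had a 1 in more than one column, this loop would keep the last matching column, whereas np.select keeps the first.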

Answer 4

df['flag'] = np.where(df['A'] == 1, 'A', 
    np.where(df['B'] == 1, 'B',
    np.where(df['C'] == 1, 'C',
    np.where(df['D'] == 1, 'D', '?'))))
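
The nested form checks the columns in order and falls back to '?' when none matches, which is the same behaviour as np.select with a default. A small sketch to compare the two, assuming a hypothetical three-row frame (not the question's data) whose last row is all zeros so the fallback is visible:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last row has no value in any column.
df = pd.DataFrame({
    "A": [1, 0, 0],
    "B": [0, 1, 0],
    "C": [0, 0, 0],
    "D": [0, 0, 0],
})

# Nested np.where: the first matching column wins, '?' otherwise.
nested = np.where(df["A"] == 1, "A",
         np.where(df["B"] == 1, "B",
         np.where(df["C"] == 1, "C",
         np.where(df["D"] == 1, "D", "?"))))

# Same logic with np.select, building the conditions from a column list.
cols = ["A", "B", "C", "D"]
selected = np.select([df[c] == 1 for c in cols], cols, default="?")

df["flag"] = nested
print(df)
```

Both arrays hold the same labels here; the np.select version scales more comfortably when the list of columns grows.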

How do I assign a new column based on conditions in other columns?
