Question

I have a data frame with the following data:

Run_1   Run_2   Run_3   Avg
5.26    6.08    1.8     2
273     0       0       23
5.26    6.08    1.8     1

Its shape is:

(2928, 501)

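In the columns whose name contains the substring Run_, and only in those, I want to change every value > 0 to 0 and every value that is currently 0 to 1. The columns run from Run_1, Run_2, ... to Run_500. The conditional change should not be applied to any column other than Run_1, Run_2, ... Run_500.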

So the desired output is:

Run_1   Run_2   Run_3   Avg
0       0       0       2
0       1       1       23
0       0       0       1

I tried the following:

    maxGen = np.max(df.filter(regex='Run_').values) + 5555.
    df.loc[df.filter(regex='Run_') > 0] = maxGen

But I get this error:

ValueError: cannot copy sequence with size 500 to array axis with dimension 2928

Edit: There are no negative values in the data frame.

Answer 1

You can try this:

df.assign(**df.filter(like='Run_').eq(0).astype(int))

Output:

   Run_1  Run_2  Run_3  Avg
0      0      0      0    2
1      0      1      1   23
2      0      0      0    1

Or, if you don't like "**" unpacking, use join:

df.filter(like='Run_').eq(0).astype(int).join(df['Avg'])
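
For reference, here is a minimal self-contained sketch of the assign approach, run on the three sample rows from the question (the real frame has 500 Run_ columns). The names run_cols and result are only illustrative and do not appear in the original post.

```python
import pandas as pd

# Sample data copied from the question (the real frame is 2928 x 501).
df = pd.DataFrame({
    "Run_1": [5.26, 273, 5.26],
    "Run_2": [6.08, 0, 6.08],
    "Run_3": [1.8, 0, 1.8],
    "Avg": [2, 23, 1],
})

# Keep only the columns whose name contains "Run_".
run_cols = df.filter(like="Run_")

# eq(0) is True where a value equals 0; astype(int) maps True -> 1, False -> 0.
# assign(**...) overwrites the Run_ columns and leaves "Avg" untouched.
result = df.assign(**run_cols.eq(0).astype(int))
print(result)
```

Because the edit in the question says there are no negative values, testing "equal to 0" is enough to cover both rules: zeros become 1 and everything else becomes 0.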

Answer 2

transform should work:

df[[x for x in df.columns if 'Run_' in x]] = df[[x for x in df.columns if 'Run_' in x]].transform(lambda x: x.eq(0).astype(int))
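
The same in-place idea can probably be written more compactly by building the column list once and calling eq(0) on the whole sub-frame, without transform. This is a sketch on the question's sample data; run_cols is an illustrative name, not something from the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Run_1": [5.26, 273, 5.26],
    "Run_2": [6.08, 0, 6.08],
    "Run_3": [1.8, 0, 1.8],
    "Avg": [2, 23, 1],
})

# Build the list of target columns once instead of repeating the comprehension.
run_cols = [c for c in df.columns if "Run_" in c]

# Overwrite only those columns in place: 0 -> 1, anything else -> 0.
df[run_cols] = df[run_cols].eq(0).astype(int)
print(df)
```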

Answer 3

IIUC

df.iloc[:,:-1]=(~df.astype(bool)).astype(int)
df
Out[54]: 
   Run_1  Run_2  Run_3  Avg
0      0      0      0    2
1      0      1      1   23
2      0      0      0    1
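
Note that iloc[:, :-1] selects every column except the last one, so the snippet above relies on Avg being the final column. If that ordering is not guaranteed, a variant that picks the columns by name might look like the sketch below; is_run is an illustrative name, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Run_1": [5.26, 273, 5.26],
    "Run_2": [6.08, 0, 6.08],
    "Run_3": [1.8, 0, 1.8],
    "Avg": [2, 23, 1],
})

# Boolean array marking the columns whose name contains "Run_".
is_run = df.columns.str.contains("Run_", regex=False)

# astype(bool) is True for non-zero values; ~ flips that, so zeros become 1.
df.loc[:, is_run] = (~df.loc[:, is_run].astype(bool)).astype(int)
print(df)
```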
