How do I set multiple values in a pandas df column to np.nan based on a condition in another column?

Time: 2019-01-07 16:02:29

Tags: python pandas dataframe

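I don't have much coding experience and this is my first question, so please bear with me. I need a way to change multiple values in a pandas df column to np.nan, depending on a condition in another column. To do this, I created copies of the required columns, named "Vorgabe" and "Temp".

Whenever the value in "Grad" is not 0, I want to change the values in a defined region of "Vorgabe" and "Temp" to np.nan.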

print(df)  

    OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0        22.0    20.0    5   0.0     22.0  20.0
1        22.0    20.5    7   0.0     22.0  20.5
2        22.0    21.0    8   1.0     22.0  21.0
3        22.0    21.0    6   0.0     22.0  21.0
4        22.0    23.5    7   0.0     22.0  20.0
5        23.0    21.5    1   0.0     23.0  21.5
6        24.0    22.5    3   1.0     24.0  22.5
7        24.0    23.0    4   0.0     24.0  23.0
8        24.0    25.5    9   0.0     24.0  25.5
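
For anyone who wants to reproduce this, the frame printed above can be rebuilt roughly as follows. This is only a sketch: the values are copied from the printout, and how `df` was actually created is not shown in the question.

```python
import numpy as np
import pandas as pd

# Values copied from the printed frame in the question
df = pd.DataFrame({
    "OptOpTemp": [22.0, 22.0, 22.0, 22.0, 22.0, 23.0, 24.0, 24.0, 24.0],
    "OpTemp":    [20.0, 20.5, 21.0, 21.0, 23.5, 21.5, 22.5, 23.0, 25.5],
    "BSP":       [5, 7, 8, 6, 7, 1, 3, 4, 9],
    "Grad":      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    "Vorgabe":   [22.0, 22.0, 22.0, 22.0, 22.0, 23.0, 24.0, 24.0, 24.0],
    "Temp":      [20.0, 20.5, 21.0, 21.0, 20.0, 21.5, 22.5, 23.0, 25.5],
})
print(df)
```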

So this is what I want to achieve:

    OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0        22.0    20.0    5   0.0     22.0  20.0
1        22.0    20.5    7   0.0     nan   nan      <-one row above
2        22.0    21.0    8   1.0     nan   nan
3        22.0    21.0    6   0.0     nan   nan      <-one row below
4        22.0    23.5    7   0.0     22.0  20.0
5        23.0    21.5    1   0.0     nan   nan
6        24.0    22.5    3   1.0     nan   nan
7        24.0    23.0    4   0.0     nan   nan
8        24.0    25.5    9   0.0     24.0  25.5

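Can anyone help me solve this?

Edit: I may not have been clear. The goal is to change every value of "Vorgabe" and "Temp" within a defined region to nan. In my example, the region is the row containing 1.0 together with one row above and one row below it. So not only the row where the 1.0 is, but also the rows directly above and below.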

4 answers:

Answer 0: (score: 5)

Use loc:

df.loc[df.Grad != 0.0, ['Vorgabe', 'Temp']] = np.nan
print(df)

Output:

   OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0       22.0    20.0    5   0.0     22.0  20.0
1       22.0    20.5    7   0.0     22.0  20.5
2       22.0    21.0    8   1.0      NaN   NaN
3       22.0    21.0    6   0.0     22.0  21.0
4       22.0    23.5    7   0.0     22.0  20.0
5       23.0    21.5    1   0.0     23.0  21.5
6       24.0    22.5    3   1.0      NaN   NaN
7       24.0    23.0    4   0.0     24.0  23.0
8       24.0    25.5    9   0.0     24.0  25.5
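
Note that this condition only selects the rows where `Grad` itself is non-zero, so the neighbouring rows described in the question's edit stay untouched. A small sketch to see which rows the mask picks, using a stand-alone `grad` Series (a name assumed here for illustration) filled with the question's values:

```python
import pandas as pd

# Grad values copied from the question's frame
grad = pd.Series([0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], name="Grad")

# Index labels of the rows that df.loc[df.Grad != 0.0, ...] would touch
selected = grad.index[grad != 0].tolist()
print(selected)  # [2, 6]
```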

Answer 1: (score: 3)

You can use numpy.where:

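import numpy as np

# np.where(condition, value_if_true, value_if_false):
# write NaN where Grad is non-zero, otherwise keep the source column's value
df['Vorgabe'] = np.where(df['Grad'] != 0, np.nan, df['OptOpTemp'])
df['Temp'] = np.where(df['Grad'] != 0, np.nan, df['OpTemp'])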

Answer 2: (score: 1)

Chain 3 conditions with | (bitwise OR), using shift to build the masks for the rows above and below each 1:

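# Variant 1: flag rows where Grad equals 1, plus the row after and the row before
mask1 = df['Grad'] == 1
mask2 = df['Grad'].shift() == 1
mask3 = df['Grad'].shift(-1) == 1

# Variant 2 (alternative): flag any non-zero Grad instead. fill_value=0 stops the
# NaN that shift introduces at the edges from comparing as "not equal to 0".
# mask1 = df['Grad'] != 0
# mask2 = df['Grad'].shift(fill_value=0) != 0
# mask3 = df['Grad'].shift(-1, fill_value=0) != 0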

mask = mask1 | mask2 | mask3

df.loc[mask, ['Vorgabe', 'Temp']] = np.nan
print (df)
   OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0       22.0    20.0    5   0.0     22.0  20.0
1       22.0    20.5    7   0.0      NaN   NaN
2       22.0    21.0    8   1.0      NaN   NaN
3       22.0    21.0    6   0.0      NaN   NaN
4       22.0    23.5    7   0.0     22.0  20.0
5       23.0    21.5    1   0.0      NaN   NaN
6       24.0    22.5    3   1.0      NaN   NaN
7       24.0    23.0    4   0.0      NaN   NaN
8       24.0    25.5    9   0.0     24.0  25.5

General solution for multiple rows:

N = 1
# shift offsets to test: 0..N followed by -1..-N
r = np.concatenate([np.arange(0, N+1), np.arange(-1, -N-1, -1)])
# build one boolean mask per offset, then OR them all together with reduce
mask = np.logical_or.reduce([df['Grad'].shift(x) == 1 for x in r])

df.loc[mask, ['Vorgabe', 'Temp']] = np.nan
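
As an alternative to shifting in a loop, the same neighbourhood can be sketched with a centered rolling window. This is not part of the answer above, just one possible variant; it tests for a non-zero `Grad` rather than `== 1`, and the frame below only carries the three columns needed for the demonstration.

```python
import numpy as np
import pandas as pd

# Reduced copy of the question's frame (only the columns used here)
df = pd.DataFrame({
    "Grad":    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    "Vorgabe": [22.0, 22.0, 22.0, 22.0, 22.0, 23.0, 24.0, 24.0, 24.0],
    "Temp":    [20.0, 20.5, 21.0, 21.0, 20.0, 21.5, 22.5, 23.0, 25.5],
})

N = 1  # number of rows to blank on each side of a non-zero Grad

# 1 where Grad is non-zero, 0 elsewhere
hit = df["Grad"].ne(0).astype(int)

# A centered window of 2*N+1 rows has max 1 if any row in it is a hit;
# min_periods=1 keeps the shorter edge windows from producing NaN
mask = hit.rolling(window=2 * N + 1, center=True, min_periods=1).max().astype(bool)

df.loc[mask, ["Vorgabe", "Temp"]] = np.nan
print(df)
```

Changing `N` widens the blanked region without touching the rest of the code.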

Edit:

You can combine two masks:

N = 1
r1 = np.concatenate([np.arange(0, N+1), np.arange(-1, -N-1, -1)])
mask1 = np.logical_or.reduce([df['Grad'].shift(x) == 1 for x in r1])

N = 2
r2 = np.concatenate([np.arange(0, N+1), np.arange(-1, -N-1, -1)])
mask2 = np.logical_or.reduce([df['Grad'].shift(x) == 1.5 for x in r2])
# if == 1.5 does not match because of floating-point precision, use np.isclose instead:
#mask2 = np.logical_or.reduce([np.isclose(df['Grad'].shift(x), 1.5) for x in r2])

mask = mask1 | mask2
df.loc[mask, ['Vorgabe', 'Temp']] = np.nan
print (df)
   OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0       22.0    20.0    5   0.0     22.0  20.0
1       22.0    20.5    7   0.0      NaN   NaN
2       22.0    21.0    8   1.0      NaN   NaN
3       22.0    21.0    6   0.0      NaN   NaN
4       22.0    23.5    7   0.0      NaN   NaN
5       23.0    21.5    1   0.0      NaN   NaN
6       24.0    22.5    3   1.5      NaN   NaN <- changed value to 1.5
7       24.0    23.0    4   0.0      NaN   NaN
8       24.0    25.5    9   0.0      NaN   NaN
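
The repeated pattern above could be folded into a small helper. The sketch below is an assumption about how one might do that, not code from the answer: `neighbour_mask` is a hypothetical name, and it uses np.isclose throughout so the float comparison caveat is handled in one place.

```python
import numpy as np
import pandas as pd

def neighbour_mask(s, value, n):
    """Hypothetical helper: True where `s` is close to `value` within n rows above or below."""
    hit = pd.Series(np.isclose(s, value), index=s.index)
    # fill_value=False keeps the edges introduced by shift from counting as hits
    shifted = [hit.shift(k, fill_value=False).to_numpy() for k in range(-n, n + 1)]
    return pd.Series(np.logical_or.reduce(shifted), index=s.index)

# Grad values from the example above, with row 6 changed to 1.5
grad = pd.Series([0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0])

# One row around each 1.0, two rows around each 1.5
mask = neighbour_mask(grad, 1.0, 1) | neighbour_mask(grad, 1.5, 2)
print(mask.tolist())
```

The resulting mask can then be passed to `df.loc[mask, ['Vorgabe', 'Temp']] = np.nan` as before.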

Answer 3: (score: 0)

You can use df.apply(f, axis=1) and define f as whatever you want to do to each row. Your description seems to say you want

def f(row):
    # blank both columns when this row's Grad is non-zero
    if row['Grad'] != 0:
        row.loc[['Vorgabe', 'Temp']] = np.nan
    return row

df = df.apply(f, axis=1)

However, your example seems to indicate that you want something else.