Change values in columns to np.nan based on values in another column

Time: 2019-01-10 18:40:46

Tags: python pandas dataframe

As the title states, I want to change the values in 'Vorgabe' and 'Temp' to NaN based on the values in 'Cut'.

Here is an example of the input and the expected result. I think this is the best way to explain my problem.

     OptOpTemp  OpTemp  Grad  Vorgabe  Temp  Cut
0        22.5      24   0.0     22.5    24   0.0
1        22.5      24   0.0     22.5    24   0.0
2        23.5      24   1.0     23.5    24   1.0
3        23.5      24   0.0     23.5    24   0.0
4        23.5      24   0.0     23.5    24   0.0
5        23.5      24   0.0     23.5    24   0.0
6        23.5      24   0.0     23.5    24   0.0
7        23.5      24   0.0     23.5    24   0.0
8        23.5      24   0.0     23.5    24   0.0
9        26.0      24   2.5     26.0    24   3.0
10       26.0      24   0.0     26.0    24   0.0
11       26.0      24   0.0     26.0    24   0.0
12       26.0      24   0.0     26.0    24   0.0
13       26.0      24   0.0     26.0    24   0.0

I want to change this to:

     OptOpTemp  OpTemp  Grad  Vorgabe  Temp  Cut
0        22.5      24   0.0     22.5    24   0.0
1        22.5      24   0.0     nan    nan   0.0 <- one row above
2        23.5      24   1.0     nan    nan   1.0
3        23.5      24   0.0     nan    nan   0.0 <- one row below
4        23.5      24   0.0     23.5    24   0.0
5        23.5      24   0.0     23.5    24   0.0
6        23.5      24   0.0     nan    nan   0.0 <- three rows above
7        23.5      24   0.0     nan    nan   0.0
8        23.5      24   0.0     nan    nan   0.0
9        26.0      24   2.5     nan    nan   3.0
10       26.0      24   0.0     nan    nan   0.0
11       26.0      24   0.0     nan    nan   0.0
12       26.0      24   0.0     nan    nan   0.0 <- three rows below
13       26.0      24   0.0     26.0    24   0.0

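The number of rows above and below is given by the value in 'Cut'.

I want this to work for every integer value in 'Cut'. So if a 2 occurs, two rows above and two rows below should be cut, in addition to the row itself. I think a loop is needed, but I don't know how to get the right result...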

1 Answer:

Answer 0: (score: 1)

One way is to build an array of index labels and mask by them:

import numpy as np

# define in-scope cuts: rows with a nonzero 'Cut', cast to integer widths
cuts = df.loc[df['Cut'] != 0, 'Cut'].astype(int)

# calculate array of index labels: from i - val to i + val for each cut at label i
idx = np.hstack([np.arange(i - val, i + val + 1) for i, val in cuts.items()])

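# set the target columns to NaN at those index labels
# (assumes every label in idx exists in df.index; missing labels may raise a KeyError)
df.loc[idx, ['Vorgabe', 'Temp']] = np.nan

# alternative: use this if the index ranges may fall outside the DataFrame index
# df.loc[df.index.isin(idx), ['Vorgabe', 'Temp']] = np.nan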

print(df)

#     OptOpTemp  OpTemp  Grad  Vorgabe  Temp  Cut
# 0        22.5      24   0.0     22.5  24.0  0.0
# 1        22.5      24   0.0      NaN   NaN  0.0
# 2        23.5      24   1.0      NaN   NaN  1.0
# 3        23.5      24   0.0      NaN   NaN  0.0
# 4        23.5      24   0.0     23.5  24.0  0.0
# 5        23.5      24   0.0     23.5  24.0  0.0
# 6        23.5      24   0.0      NaN   NaN  0.0
# 7        23.5      24   0.0      NaN   NaN  0.0
# 8        23.5      24   0.0      NaN   NaN  0.0
# 9        26.0      24   2.5      NaN   NaN  3.0
# 10       26.0      24   0.0      NaN   NaN  0.0
# 11       26.0      24   0.0      NaN   NaN  0.0
# 12       26.0      24   0.0      NaN   NaN  0.0
# 13       26.0      24   0.0     26.0  24.0  0.0
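
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It is not the code from the answer above: it works on row positions instead of index labels, so it should also cope with a non-default index and with windows that run past the first or last row. The helper name `mask_around_cuts` and its parameters are made up for illustration, and it assumes 'Cut' holds non-negative, integer-valued numbers with no missing values.

```python
import numpy as np
import pandas as pd


def mask_around_cuts(df, cut_col="Cut", target_cols=("Vorgabe", "Temp")):
    """Return a copy of df where target_cols are NaN within `Cut` rows of each nonzero Cut.

    Hypothetical helper: assumes cut_col contains non-negative integer-valued
    numbers and no NaN.
    """
    out = df.copy()
    cols = list(target_cols)
    # Cast to float first so that NaN can be stored without a dtype change on assignment.
    out[cols] = out[cols].astype(float)

    cut = out[cut_col].to_numpy()
    mask = np.zeros(len(out), dtype=bool)
    # Walk over the row positions that carry a nonzero cut value.
    for pos in np.flatnonzero(cut != 0):
        width = int(cut[pos])
        start = max(pos - width, 0)             # clip at the first row
        stop = min(pos + width + 1, len(out))   # clip at the last row
        mask[start:stop] = True

    out.loc[mask, cols] = np.nan
    return out


# Rebuild the sample data from the question.
df = pd.DataFrame({
    "OptOpTemp": [22.5] * 2 + [23.5] * 7 + [26.0] * 5,
    "OpTemp": [24] * 14,
    "Grad": [0.0, 0.0, 1.0] + [0.0] * 6 + [2.5] + [0.0] * 4,
    "Vorgabe": [22.5] * 2 + [23.5] * 7 + [26.0] * 5,
    "Temp": [24] * 14,
    "Cut": [0.0, 0.0, 1.0] + [0.0] * 6 + [3.0] + [0.0] * 4,
})

result = mask_around_cuts(df)
print(result)
```

The loop only runs once per nonzero 'Cut' value, not once per row, so it stays cheap when cuts are sparse. Overlapping windows are harmless because they only set the same boolean mask entries again.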