Setting a column value based on conditions on another column in pandas

Time: 2021-05-04 13:37:26

Tags: python pandas numpy data-science

I have some sample data:

datetime             temperature   season
2021-04-10 01:00:00.    10.        Heating season
2021-04-10 01:00:00.    26.        Heating season
2021-07-10 01:00:00.    16.        Cooling season
2021-07-10 01:00:00.    30.        Cooling season

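I want to create a new column named new_temperature: a) If temperature is below 18 and the season is the heating season, new_temperature should be 25; if it is the cooling season instead, it should be 18. b) If temperature is above 25 and the season is the cooling season, new_temperature should be 18; if it is the heating season instead, it should be 22.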

The expected output looks like this:

datetime             temperature   season.         new_temperature
2021-04-10 01:00:00.    10.        Heating season.    25
2021-04-10 01:00:00.    26.        Heating season.    22
2021-07-10 01:00:00.    16.        Cooling season.    18
2021-07-10 01:00:00.    30.        Cooling season.    18

1 Answer:

Answer 0 (score: 3)

Use np.select with 4 conditions:

import numpy as np

# Each condition pairs a temperature threshold with a season check
cond_1 = (df.temperature < 18) & (df.season == "Heating season")
cond_2 = (df.temperature < 18) & (df.season != "Heating season")
cond_3 = (df.temperature > 25) & (df.season == "Cooling season")
cond_4 = (df.temperature > 25) & (df.season != "Cooling season")

conditions = [cond_1, cond_2, cond_3, cond_4]
# choices[i] is the value used where conditions[i] is True
choices = [25, 18, 18, 22]

df["new_temperature"] = np.select(conditions, choices)

This gives:

               datetime  temperature          season  new_temperature
0  2021-04-10 01:00:00.         10.0  Heating season               25
1  2021-04-10 01:00:00.         26.0  Heating season               22
2  2021-07-10 01:00:00.         16.0  Cooling season               18
3  2021-07-10 01:00:00.         30.0  Cooling season               18

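Note: your conditions are not exhaustive (a temperature between 18 and 25 matches none of them), so you may want to pass a default value to np.select as the last argument. It is placed in the result wherever no condition matches; if you omit it, np.select fills those rows with 0.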
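
To illustrate the note, here is a minimal sketch. The fifth row (20 degrees, matching none of the conditions) and the choice of np.nan as the fallback are assumptions added for illustration; neither comes from the question, which does not say what should happen in that range.

```python
import numpy as np
import pandas as pd

# Sample data from the question, plus one hypothetical row (20.0)
# that falls between the two thresholds and matches no condition.
df = pd.DataFrame({
    "temperature": [10.0, 26.0, 16.0, 30.0, 20.0],
    "season": [
        "Heating season", "Heating season",
        "Cooling season", "Cooling season", "Cooling season",
    ],
})

heating = df["season"] == "Heating season"
cooling = df["season"] == "Cooling season"

conditions = [
    (df["temperature"] < 18) & heating,
    (df["temperature"] < 18) & ~heating,
    (df["temperature"] > 25) & cooling,
    (df["temperature"] > 25) & ~cooling,
]
choices = [25, 18, 18, 22]

# Assumed fallback: mark unmatched rows as NaN instead of the implicit 0.
df["new_temperature"] = np.select(conditions, choices, default=np.nan)
print(df)
```

With NaN as the fallback the column becomes float, and the unmatched row stands out instead of silently receiving 0. If you would rather keep the original reading for those rows, passing default=df["temperature"] is another option.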
