Combining np.where and if in pandas

Time: 2018-09-23 13:25:12

Tags: python pandas numpy dataframe

I have the following dataframe in pandas:

   prod     S       X
   a        10      123             
   b        20      150
   b        30      140
   a        40      100

   The formulas for products a and b are as follows:
   a = IF(S>X, (0.6/100*(S-X)), 0)
   b = IF(S>X, (0.2/100*(S-X)), 0)

How can I compute a new column in the existing dataframe from the formulas for products a and b?
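Read as plain Python, the two formulas differ only in the percentage rate. The sketch below spells that out for a single row; the names `RATES` and `formula_value` are made up for illustration and do not appear in the question.

```python
# Percentage rate per product, taken from the two IF formulas.
RATES = {'a': 0.6, 'b': 0.2}

def formula_value(prod, s, x):
    """Scalar version of IF(S > X, rate/100 * (S - X), 0)."""
    if s > x:
        return RATES[prod] / 100 * (s - x)
    return 0

# S > X, product a: 0.6/100 * (140 - 100), approximately 0.24
print(formula_value('a', 140, 100))
# S <= X: the formula falls through to 0
print(formula_value('b', 20, 150))
```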

4 answers:

Answer 0 (score: 2)

You can first use np.where and then np.select. Data from @AnnaIliukovich-Strakovskaia.

a = np.where(df['S'] > df['X'], 0.6/100*(df['S'] - df['X']), 0)
b = np.where(df['S'] > df['X'], 0.2/100*(df['S'] - df['X']), 0)

df['result'] = np.select([df['prod'].eq('a'), df['prod'].eq('b')], [a, b], np.nan)

print(df)

  prod    S    X  result
0    a   10  123    0.00
1    b   20  150    0.00
2    b   30  140    0.00
3    a  140  100    0.24
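For reference, here is a self-contained sketch of the same idea with the imports included. The extra row with product 'c' is an assumption added only to show what the np.select default does when no condition matches.

```python
import numpy as np
import pandas as pd

# Same data as above, plus one hypothetical product 'c' with no formula.
df = pd.DataFrame({'prod': ['a', 'b', 'b', 'a', 'c'],
                   'S': [10, 20, 30, 140, 200],
                   'X': [123, 150, 140, 100, 100]})

diff = df['S'] - df['X']

# One candidate column per product formula, computed for every row.
a = np.where(diff > 0, 0.6 / 100 * diff, 0)
b = np.where(diff > 0, 0.2 / 100 * diff, 0)

# np.select picks, row by row, the choice whose condition is True;
# rows matching no condition get the default (NaN here).
df['result'] = np.select([df['prod'].eq('a'), df['prod'].eq('b')],
                         [a, b], default=np.nan)
print(df)
```

Note that both candidate arrays are computed for all rows and np.select only chooses between them, which is usually fine for a handful of products.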

Answer 1 (score: 0)

You can use apply with a user-defined function.

Data:

df = pd.DataFrame({'prod':['a','b','b','a'] ,
              'S':[10,20,30,140],    
              'X':[123,150,140,100]})

     S    X prod
0   10  123    a
1   20  150    b
2   30  140    b
3  140  100    a

Function:

def func(row):
    # apply(axis=1) passes one row at a time as a Series
    result = 0
    if row.S > row.X:
        if row['prod'] == 'a':
            result = 0.6/100*(row.S-row.X)
        elif row['prod'] == 'b':
            result = 0.2/100*(row.S-row.X)
    return result

Usage:

df.join(df.apply(func, axis=1).rename('col'))

Result:

     S    X prod   col
0   10  123    a  0.00
1   20  150    b  0.00
2   30  140    b  0.00
3  140  100    a  0.24
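If more products are added later, the chain of if statements can be swapped for a dictionary lookup. This is only a sketch of that variation: `rates` and `row_result` are hypothetical names, and unknown products are assumed to yield 0, as in the function above.

```python
import pandas as pd

df = pd.DataFrame({'prod': ['a', 'b', 'b', 'a'],
                   'S': [10, 20, 30, 140],
                   'X': [123, 150, 140, 100]})

# Percentage rate per product (hypothetical helper, not from the answer).
rates = {'a': 0.6, 'b': 0.2}

def row_result(row):
    # row is a single DataFrame row, passed as a Series by apply(axis=1)
    if row['S'] > row['X']:
        # products missing from the dict fall back to a rate of 0
        return rates.get(row['prod'], 0.0) / 100 * (row['S'] - row['X'])
    return 0.0

out = df.assign(col=df.apply(row_result, axis=1))
print(out)
```

Like any apply with axis=1, this calls a Python function once per row, so it is convenient rather than fast.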

Answer 2 (score: 0)

If you are after speed, this approach is faster, though less readable. The where and if logic is done with boolean indexing.

Setup

import pandas as pd
import numpy as np

df = pd.DataFrame({'prod':['a','b','b','a'] ,
                   'S':[10,20,30,140],    
                   'X':[123,150,140,100]})

print(df)

  prod    S    X
0    a   10  123
1    b   20  150
2    b   30  140
3    a  140  100

Code

# make an array to hold results
results = np.zeros(len(df))

# make arrays from df values
SX_vals = df[['S', 'X']].values
prod = df['prod'].values

# product multiplier dictionary
prod_dict = {'a': .006, 'b': .002}

# make array of S - X
sub_result = np.subtract(SX_vals[:,0], SX_vals[:,1])

# make boolean mask that is True where S - X is positive
s_bigger = (sub_result > 0)

# loop through products (keys) of prod_dict
for key in prod_dict.keys():
    # mask where (S-X) > 0 and prod == key 
    mask = s_bigger & (prod == key)
    # multiply and insert into result array
    results[mask] = sub_result[mask] * prod_dict[key]

# assign result array to dataframe
df['result'] = results

Result

print(df)

  prod    S    X  result
0    a   10  123    0.00
1    b   20  150    0.00
2    b   30  140    0.00
3    a  140  100    0.24
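The speed claim is easy to check on your own data. The following is a rough timing sketch, not a benchmark from this answer: the frame size, the random data, and the helper names `with_masks` and `with_apply` are all assumptions, and the numbers will vary by machine.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test frame; size and value ranges are arbitrary choices.
rng = np.random.default_rng(0)
n = 10_000
big = pd.DataFrame({'prod': rng.choice(['a', 'b'], size=n),
                    'S': rng.integers(0, 200, size=n),
                    'X': rng.integers(0, 200, size=n)})

prod_dict = {'a': .006, 'b': .002}

def with_masks(df):
    # boolean-indexing approach: one masked assignment per product
    results = np.zeros(len(df))
    sub_result = df['S'].to_numpy() - df['X'].to_numpy()
    prod = df['prod'].to_numpy()
    for key, rate in prod_dict.items():
        mask = (sub_result > 0) & (prod == key)
        results[mask] = sub_result[mask] * rate
    return results

def with_apply(df):
    # row-wise approach for comparison: one Python call per row
    def func(row):
        if row['S'] > row['X']:
            return prod_dict[row['prod']] * (row['S'] - row['X'])
        return 0.0
    return df.apply(func, axis=1).to_numpy()

print('masks:', timeit.timeit(lambda: with_masks(big), number=3))
print('apply:', timeit.timeit(lambda: with_apply(big), number=3))
```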

Answer 3 (score: 0)

Using pandas.Series.map and pandas.Series.where:

d = {'a': .6, 'b': .2}
df.assign(
    result=df['prod'].map(d).mul(df.S - df.X).where(df.S > df.X, 0) / 100
)

  prod    S    X  result
0    a   10  123    0.00
1    b   20  150    0.00
2    b   30  140    0.00
3    a  140  100    0.24
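The one-liner is compact, so it may help to see the chain broken into steps. This is the same computation with intermediate variables; the names `rate`, `scaled`, and `kept` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'prod': ['a', 'b', 'b', 'a'],
                   'S': [10, 20, 30, 140],
                   'X': [123, 150, 140, 100]})

d = {'a': .6, 'b': .2}

# Step 1: look up the percentage rate for each row's product.
# Products missing from d would map to NaN.
rate = df['prod'].map(d)

# Step 2: multiply by S - X (negative wherever S < X).
scaled = rate.mul(df.S - df.X)

# Step 3: Series.where keeps values where the condition is True
# and replaces the rest with the second argument (0 here).
kept = scaled.where(df.S > df.X, 0)

# Step 4: convert the percentage to a fraction.
result = kept / 100
print(df.assign(result=result))
```

Series.where reads differently from np.where: it takes the condition for the values to keep, followed by the replacement for everything else.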