How to efficiently map a new variable in pandas

Time: 2018-07-17 06:45:46

Tags: python pandas dataframe

Here is my data:

Id  Amount
1   6
2   2
3   0
4   6

What I need is a mapping: if Amount is greater than 3, Map is 1; if Amount is less than 3, Map is 0.

Id  Amount   Map
1   6        1
2   2        0
3   0        0
4   6        1

Here is what I did:

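a = df[['Id', 'Amount']]
# keep only the rows that should map to 1; copy to avoid SettingWithCopyWarning
a = a[a['Amount'] >= 3].copy()
a['Map'] = 1
a = a[['Id', 'Map']]
df = df.merge(a, on='Id', how='left')
# rows without a match have NaN in Map: fill with 0 and restore the integer dtype
df['Map'] = df['Map'].fillna(0).astype(int)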

It works, but it is not very configurable and not efficient.

1 answer:

Answer 0: (score: 2)

Convert the boolean mask to integers:

#for better performance convert to numpy array
df['Map'] = (df['Amount'].values >= 3).astype(int)
#pure pandas solution
df['Map'] = (df['Amount'] >= 3).astype(int)
print (df)
   Id  Amount  Map
0   1       6    1
1   2       2    0
2   3       0    0
3   4       6    1
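
Since the question asks for something more configurable, here is a minimal self-contained sketch of the same idea using `np.where`. The names `threshold`, `high_value` and `low_value` are hypothetical and not from the original post. They only show how the cutoff and the two output values could be made into parameters.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"Id": [1, 2, 3, 4], "Amount": [6, 2, 0, 6]})

# Hypothetical parameters, chosen here only for illustration
threshold = 3
high_value, low_value = 1, 0

# np.where picks high_value where the condition holds, low_value elsewhere
df["Map"] = np.where(df["Amount"] >= threshold, high_value, low_value)
print(df)
```

For the plain 1/0 case this gives the same column as `(df['Amount'] >= 3).astype(int)`; `np.where` is mainly useful when the two output values are something other than 1 and 0.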

Performance

#[400000 rows x 3 columns]
df = pd.concat([df] * 100000, ignore_index=True)

In [133]: %timeit df['Map'] = (df['Amount'].values >= 3).astype(int)
2.44 ms ± 97.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [134]: %timeit df['Map'] = (df['Amount'] >= 3).astype(int)
2.6 ms ± 66.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
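
The `%timeit` lines above only run inside IPython. As a rough sketch, assuming a plain Python script instead, the same comparison could be reproduced with the standard `timeit` module. The helper names `via_numpy` and `via_pandas` are hypothetical, and the timings will differ by machine and library version.

```python
import timeit

import pandas as pd

# Rebuild the 400000-row frame used in the benchmark above
df = pd.DataFrame({"Id": [1, 2, 3, 4], "Amount": [6, 2, 0, 6]})
big = pd.concat([df] * 100000, ignore_index=True)


def via_numpy():
    # compare on the underlying NumPy array
    return (big["Amount"].values >= 3).astype(int)


def via_pandas():
    # compare on the pandas Series
    return (big["Amount"] >= 3).astype(int)


# best of 3 repeats, 10 calls each, reported per call
t_numpy = min(timeit.repeat(via_numpy, number=10, repeat=3)) / 10
t_pandas = min(timeit.repeat(via_pandas, number=10, repeat=3)) / 10
print(f"numpy:  {t_numpy * 1e3:.2f} ms per call")
print(f"pandas: {t_pandas * 1e3:.2f} ms per call")
```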