Is there an element-wise IIF function in pandas?
E.g., given a DataFrame:
w = pd.DataFrame({'Date':pd.to_datetime(['2016-01-01','2016-01-02','2016-01-03']),'A1':[0.3,0.1,0.1],'A2':[0.4,0.4,0.4]}).set_index(['Date'])
If an element is > 0.2, set it to 1; otherwise set it to 0.
There are mask()/where(), but with those the value for the true case comes from the old DataFrame.
Answer 0 (score: 2)
You need to compare with 0.2 and cast the resulting boolean DataFrame to np.uint8:
print (w > .2)
               A1    A2
Date
2016-01-01   True  True
2016-01-02  False  True
2016-01-03  False  True
w1 = (w > .2).astype(np.uint8)
print (w1)
            A1  A2
Date
2016-01-01   1   1
2016-01-02   0   1
2016-01-03   0   1
print (w.gt(.2).astype(np.uint8))
            A1  A2
Date
2016-01-01   1   1
2016-01-02   0   1
2016-01-03   0   1
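For reference, here is a self-contained sketch of the compare-then-cast approach. It rebuilds the sample frame from the question and adds the two imports that the snippets here assume; the name is_large is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, indexed by date.
w = pd.DataFrame({
    'Date': pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-03']),
    'A1': [0.3, 0.1, 0.1],
    'A2': [0.4, 0.4, 0.4],
}).set_index('Date')

# Element-wise comparison gives a boolean DataFrame of the same shape.
is_large = w > 0.2

# Casting the booleans maps True -> 1 and False -> 0.
w1 = is_large.astype(np.uint8)
print(w1)
```

w.gt(.2) is the method form of w > .2, so both spellings should give the same frame.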
Comparing the solutions:
#[300000 rows x 2 columns]
#for testing index is not necessary
w = pd.concat([w]*100000).reset_index(drop=True)
In [49]: %timeit ((w > .2).astype(int))
100 loops, best of 3: 2.11 ms per loop
In [50]: %timeit ((w > .2).astype(np.short))
1000 loops, best of 3: 1.8 ms per loop
In [51]: %timeit ((w > .2).astype(np.uint8))
1000 loops, best of 3: 1.35 ms per loop
In [82]: %timeit (w.gt(.2).astype(np.uint8))
1000 loops, best of 3: 1.02 ms per loop
In [52]: %timeit (w.applymap(lambda x: 1 if x>0.2 else 0))
1 loop, best of 3: 334 ms per loop
Thanks to piRSquared for another solution:
pd.DataFrame((w.values > .2).astype(np.uint8), w.index, w.columns)
In [112]: %timeit (pd.DataFrame((w.values > .2).astype(np.uint8), w.index, w.columns))
1000 loops, best of 3: 877 µs per loop
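If the two outcomes are not simply 1 and 0, a closer analogue of IIF is numpy.where(condition, value_if_true, value_if_false). This was not one of the solutions timed here; it is only a sketch, assuming w is the small sample frame from the question. The chained where()/mask() line is included just to show that those methods can produce the same values, less directly.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, indexed by date.
w = pd.DataFrame({
    'Date': pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-03']),
    'A1': [0.3, 0.1, 0.1],
    'A2': [0.4, 0.4, 0.4],
}).set_index('Date')

# np.where picks 1 where the condition holds and 0 elsewhere; it returns
# a plain ndarray, so the index and columns are re-attached afterwards.
iif = pd.DataFrame(np.where(w.values > 0.2, 1, 0),
                   index=w.index, columns=w.columns)
print(iif)

# Same values with pandas only: where() fills the False cells with 0,
# then mask() overwrites the True cells with 1 (the result stays float).
chained = w.where(w > 0.2, 0).mask(w > 0.2, 1)
print(chained)
```

Any pair of scalars or arrays can be substituted for 1 and 0 in the np.where call.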
Answer 1 (score: 2)
jezrael's answer is the one I would use for this. Alternatively, you can also use the DataFrame.applymap function.
w.applymap(lambda x: 1 if x>0.2 else 0)
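One caveat, as far as I know: pandas 2.1 and later rename DataFrame.applymap to DataFrame.map and deprecate the old name. Below is a sketch that should work with either, assuming the sample frame from the question; the name elementwise is only illustrative.

```python
import pandas as pd

# Sample frame from the question, indexed by date.
w = pd.DataFrame({
    'Date': pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-03']),
    'A1': [0.3, 0.1, 0.1],
    'A2': [0.4, 0.4, 0.4],
}).set_index('Date')

# Use DataFrame.map where it exists (pandas 2.1+), otherwise applymap.
elementwise = w.map if hasattr(pd.DataFrame, 'map') else w.applymap

# The lambda is called once per cell, which is why this is much slower
# than the vectorized comparison on large frames.
result = elementwise(lambda x: 1 if x > 0.2 else 0)
print(result)
```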