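I'm trying to change values in a DataFrame when they start with a certain prefix. I check whether the first four characters are 0.00, and if so I want to multiply the value by 100. The formula below gives me this error: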
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(),
a.item(), a.any() or a.all().
My formula is:
Total['Rate']=Total['Rate'].apply(lambda x: Total['Rate']*100 if \
Total['Rate'].str[:4]=='0.00' else Total['Rate'])
Answer 0 (score: 3)
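You were so close. The problem is that inside the lambda you are testing and multiplying the entire column rather than the single value.
Change those references to x, as shown below, and it works.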
Total['Rate'] = Total['Rate'].apply(lambda x: x * 100 if str(x)[:4] == '0.00' else x)
Hope this helps!
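To illustrate the point above, here is a minimal sketch. The sample values are hypothetical (the question does not show its data) and assume a numeric Rate column. It reproduces the error with a whole Series in an `if`, applies the element-wise fix, and shows one caveat of the string test: very small floats are printed in scientific notation, so the '0.00' prefix check skips them. The numeric comparison at the end is an alternative, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; the question does not show its values.
Total = pd.DataFrame({'Rate': [0.001, 0.2, 5.0, 0.00002]})

# A whole Series has no single truth value, hence the ValueError.
try:
    if Total['Rate'] > 0:
        pass
except ValueError as err:
    print(err)

# Element-wise version: x is one value, so the condition is a plain bool.
by_prefix = Total['Rate'].apply(lambda x: x * 100 if str(x)[:4] == '0.00' else x)

# Caveat: str(0.00002) is '2e-05', so the prefix test leaves it unchanged.
# Comparing the number itself avoids depending on how floats are printed.
by_value = Total['Rate'].apply(lambda x: x * 100 if 0 <= x < 0.01 else x)

print(by_prefix.tolist())
print(by_value.tolist())
```

The two results differ only in the last row, where the prefix test misses the value written as 2e-05.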
Answer 1 (score: 1)
There is no need to convert to strings; it is better to convert the multiplied values to integers and compare them with 0:
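import numpy as np
import pandas as pd

Total = pd.DataFrame(data=[0.001, 0.2, 5, 0.0002], columns=['Rate'])
s = Total['Rate'] * 100  # every value scaled by 100
# Use the scaled value only where it truncates to 0, i.e. the original is below 0.01
Total['Rate'] = np.where(s.astype(int) == 0, s, Total['Rate'])
print(Total)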
   Rate
0  0.10
1  0.20
2  5.00
3  0.02
Details:
print(s)
0      0.10
1     20.00
2    500.00
3      0.02
Name: Rate, dtype: float64
print(s.astype(int))
0      0
1     20
2    500
3      0
Name: Rate, dtype: int32
print(s.astype(int) == 0)
0     True
1    False
2    False
3     True
Name: Rate, dtype: bool
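The integer cast truncates toward zero, so the mask is true whenever the scaled value lies strictly between -1 and 1, which is essentially the same as the absolute original value being below 0.01 (up to floating-point rounding right at the boundary). A minimal sketch of that equivalent comparison, with a NaN row added as a hypothetical case, since recent pandas versions refuse to cast NaN to int:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the answer's values plus one missing value.
Total = pd.DataFrame({'Rate': [0.001, 0.2, 5, 0.0002, np.nan]})

# Direct comparison instead of astype(int); NaN compares as False,
# so the missing row is simply left alone.
small = Total['Rate'].abs() < 0.01

# Scale only the selected rows in place.
Total.loc[small, 'Rate'] *= 100
print(Total)
```

This variant is an assumption about what the mask is meant to select, not the answer's own code; it trades the cast for a threshold that is stated explicitly.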
Performance:
Total = pd.DataFrame(data=[0.001,0.2,5,0.0002],columns=['Rate'])
Total = pd.concat([Total] * 10000, ignore_index=True)
In [296]: %%timeit
...: s = Total['Rate'] * 100
...: Total['Rate'] = np.where(s.round() == 0, s, Total['Rate'])
...:
2.09 ms ± 119 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [297]: %%timeit
...: Total['Rate'] = Total['Rate'].apply(lambda x: x*100 if str(x)[:4]=='0.00' else x, 1)
...:
26.2 ms ± 1.11 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
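Note that the timed vectorized snippet uses s.round() == 0, whereas the earlier example uses s.astype(int) == 0. The two masks are not interchangeable: rounding sends scaled values of 0.5 and above up to 1, while the integer cast truncates them to 0. A small sketch with hypothetical values showing where they disagree:

```python
import pandas as pd

# Hypothetical rates; 0.006 is the interesting case (scaled value about 0.6).
s = pd.Series([0.001, 0.006, 0.2]) * 100

truncated = (s.astype(int) == 0).tolist()  # truncation toward zero
rounded = (s.round() == 0).tolist()        # rounding to the nearest integer

print(truncated)  # [True, True, False]
print(rounded)    # [True, False, False]
```

For rates between 0.005 and 0.01 only the truncating mask matches the "starts with 0.00" rule from the question.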
Edit: if you want to set values using several masks, for example also setting negative values to 0, use numpy.select:
Total = pd.DataFrame(data=[0.001,0.2,5,0.0002, -10],columns=['Rate'])
s = Total['Rate'] * 100
mask1 = s.astype(int) == 0
mask2 = Total['Rate'] < 0
Total['Rate'] = np.select([mask1, mask2], [s, 0], default=Total['Rate'])
print (Total)
   Rate
0  0.10
1  0.20
2  5.00
3  0.02
4  0.00
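One detail worth keeping in mind with numpy.select: when several conditions are true for the same row, the first one in the list wins. The sketch below uses hypothetical values, including -0.001, which satisfies both masks because its scaled value truncates to 0 and it is also negative. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical data; -0.001 matches both conditions.
Total = pd.DataFrame({'Rate': [0.001, -0.001, -10]})
s = Total['Rate'] * 100
is_small = s.astype(int) == 0
is_negative = Total['Rate'] < 0

# Same condition order as the answer: the "small" rule takes priority.
small_first = np.select([is_small, is_negative], [s, 0], default=Total['Rate'])

# Reversed order: the "negative" rule takes priority.
negative_first = np.select([is_negative, is_small], [0, s], default=Total['Rate'])

print(small_first)     # middle row is scaled to about -0.1
print(negative_first)  # middle row is set to 0
```

If negative values should always become 0, putting that mask first is the safer ordering.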
Answer 2 (score: 0)
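Use this instead (assuming Rate is numeric; the .str accessor only works on strings, so the values are cast with astype(str) for the prefix test):
Total['Rate'] = Total['Rate'].mask(Total['Rate'].astype(str).str.startswith('0.00'), Total['Rate'] * 100)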
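If Rate is instead stored as strings, which the use of the .str accessor in the question may suggest, multiplying the column by 100 would repeat each string rather than scale the number. A minimal sketch under that assumption, with hypothetical sample data: test the prefix on the strings, convert with pandas.to_numeric, then apply Series.mask to the numeric values.

```python
import pandas as pd

# Hypothetical string-typed column; the question does not show its dtype.
Total = pd.DataFrame({'Rate': ['0.001', '0.2', '5', '0.0002']})

# Prefix test on the original strings.
is_small = Total['Rate'].str.startswith('0.00')

# Convert to numbers before doing any arithmetic.
rate = pd.to_numeric(Total['Rate'])

# mask replaces values where the condition is True.
Total['Rate'] = rate.mask(is_small, rate * 100)
print(Total)
```

After this, the column is numeric, so later arithmetic on it behaves as expected.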