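I have a column of floats that is over a million rows long. I need to be able to replace certain values with strings when a value is above or below certain thresholds.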
import pandas as pd
import numpy as np
df = pd.DataFrame({'foo': np.random.random(10),
                   'bar': np.random.random(10)})
df
Out[115]:
foo bar
0 0.181262 0.890826
1 0.321260 0.053619
2 0.832247 0.044459
3 0.937769 0.855299
4 0.752133 0.008980
5 0.751948 0.680084
6 0.559528 0.785047
7 0.615597 0.265483
8 0.129505 0.509945
9 0.727209 0.786113
df.at[5, 'foo'] = 'somestring'
Traceback (most recent call last):
File "<ipython-input-116-bf0f6f9e84ac>", line 1, in <module>
df.at[5, 'foo'] = 'somestring'
File "/Users/nate/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 2287, in __setitem__
self.obj._set_value(*key, takeable=self._takeable)
File "/Users/nate/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2815, in _set_value
engine.set_value(series._values, index, value)
File "pandas/_libs/index.pyx", line 95, in pandas._libs.index.IndexEngine.set_value
File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.set_value
ValueError: could not convert string to float: 'somestring'
Eventually I will need to write something like the following:
for idx, row in df.iterrows():
    if row['foo'] > some_value:
        df.at[idx, 'foo'] = 'over_some_value'
    else:
        ...  # other cases omitted
I tried using iloc, but I suspect that would be slow, and I would like to be able to use at to keep my code uniform.
Answer 0 (score: 1)
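To assign a value of a different type to a column, you may need to convert it to object first. Be warned: converting to object is risky, because the columns are no longer a numeric dtype and lose fast vectorized numeric operations.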
df = df.astype(object)
df.at[5, 'foo'] = 'somestring'
df
foo bar
0 0.163246 0.803071
1 0.946447 0.48324
2 0.777733 0.461704
3 0.996791 0.521338
4 0.320627 0.374384
5 somestring 0.987591
6 0.388765 0.726807
7 0.362077 0.76936
8 0.738139 0.0539076
9 0.208691 0.812568
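For a column with over a million rows, the same object-dtype idea can probably be applied without a Python-level loop. The following is only a sketch, not part of the answer above: it assumes a single upper threshold, and `some_value` and the label string are placeholder names borrowed from the question. It builds a boolean mask while the column is still float, casts only that column to object, and assigns the string through `loc`.

```python
import numpy as np
import pandas as pd

# Hypothetical threshold, chosen only for illustration.
some_value = 0.75

rng = np.random.default_rng(0)
df = pd.DataFrame({'foo': rng.random(10),
                   'bar': rng.random(10)})

# Build the mask while 'foo' is still float64, so the comparison is vectorized.
over = df['foo'] > some_value

# Cast only the affected column to object; 'bar' keeps its float64 dtype.
df['foo'] = df['foo'].astype(object)

# Replace every value above the threshold in one assignment.
df.loc[over, 'foo'] = 'over_some_value'

print(df.dtypes)
```

Casting just the one column limits the cost of the object conversion to where strings are actually needed. If the numeric values still matter afterwards, writing the labels to a separate column instead of overwriting 'foo' may be the safer choice.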