How do I insert a string value into a float column in pandas 0.24.2?

Asked: 2019-04-27 00:13:23

Tags: python-3.x pandas

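I have a column of more than a million floats. I need to be able to replace certain values with a string when a value is above or below certain thresholds.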

import pandas as pd

import numpy as np

df = pd.DataFrame({'foo': np.random.random(10),
                   'bar': np.random.random(10)})

df
Out[115]: 
        foo       bar
0  0.181262  0.890826
1  0.321260  0.053619
2  0.832247  0.044459
3  0.937769  0.855299
4  0.752133  0.008980
5  0.751948  0.680084
6  0.559528  0.785047
7  0.615597  0.265483
8  0.129505  0.509945
9  0.727209  0.786113

df.at[5, 'foo'] = 'somestring'
Traceback (most recent call last):

  File "<ipython-input-116-bf0f6f9e84ac>", line 1, in <module>
    df.at[5, 'foo'] = 'somestring'

  File "/Users/nate/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 2287, in __setitem__
    self.obj._set_value(*key, takeable=self._takeable)

  File "/Users/nate/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2815, in _set_value
    engine.set_value(series._values, index, value)

  File "pandas/_libs/index.pyx", line 95, in pandas._libs.index.IndexEngine.set_value

  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.set_value

ValueError: could not convert string to float: 'somestring'

Ultimately I will need to write something like the following:

for idx, row in df.iterrows():
    if row['foo'] > some_value:
        df.at[idx, 'foo'] = 'over_some_value'
    else:
        ...

I tried using iloc, but I suspect that would be slow, and I would like to be able to use at so that my code stays consistent.

1 answer:

Answer 0 (score: 1)

To assign a value of a different type to a column, you may need to convert the column to object first.

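A warning: converting to object is risky. The columns no longer have a numeric dtype, so numeric operations on them become slower and can fail once strings are mixed in.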

df=df.astype(object)
df.at[5, 'foo'] = 'somestring'
df
          foo        bar
0    0.163246   0.803071
1    0.946447    0.48324
2    0.777733   0.461704
3    0.996791   0.521338
4    0.320627   0.374384
5  somestring   0.987591
6    0.388765   0.726807
7    0.362077    0.76936
8    0.738139  0.0539076
9    0.208691   0.812568
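
For the threshold case in the question, a row-by-row loop may not be necessary. The following is a minimal sketch, assuming a single threshold and that only the foo column needs to hold strings. The sample values, the `threshold` variable, and the `'over_some_value'` label are illustrative assumptions, not taken from the original data. It builds a boolean mask while the column is still float, converts only that column to object, and assigns through .loc in one step.

```python
import pandas as pd

# Small deterministic frame standing in for the real data (values are made up).
df = pd.DataFrame({'foo': [0.18, 0.83, 0.75, 0.12],
                   'bar': [0.89, 0.05, 0.04, 0.85]})

threshold = 0.5  # hypothetical threshold

# Compute the mask while 'foo' is still a float column.
mask = df['foo'] > threshold

# Convert only the column that will hold mixed values; 'bar' stays float64.
df['foo'] = df['foo'].astype(object)

# Replace every value above the threshold in a single vectorized assignment.
df.loc[mask, 'foo'] = 'over_some_value'

print(df)
print(df.dtypes)
```

Once the column is object, df.at[idx, 'foo'] = 'somestring' also works for individual cells. Converting just the one column leaves bar as a numeric column.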