Converting object-dtype columns to float32 with pandas (Python 3.5)

Asked: 2018-07-23 08:50:07

Tags: python python-3.x pandas

This is what I tried, but it raises an error:

df.info()
df[['volume', 'open', 'high', 'low', 'close']] =pd.Series( df[['volume', 'open', 'high', 'low', 'close']], dtype='float32')

Output and error:

<class 'pandas.core.frame.DataFrame'>
Index: 4999 entries, 2018-06-01T00:01:00.000000000Z to 2018-06-06T14:20:00.000000000Z
Data columns (total 6 columns):
volume      4999 non-null object
close       4999 non-null object
high        4999 non-null object
low         4999 non-null object
open        4999 non-null object
complete    4999 non-null object
dtypes: object(6)
memory usage: 273.4+ KB
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
c:\python35\lib\site-packages\pandas\core\common.py in _asarray_tuplesafe(values, dtype)
    398                 result = np.empty(len(values), dtype=object)
--> 399                 result[:] = values
    400             except ValueError:

ValueError: could not broadcast input array from shape (4999,5) into shape (4999)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-23-b205607bf5cf> in <module>()
      1 df[['volume', 'open', 'high', 'low', 'close']].iloc[50:60] #, 'complete'
      2 df.info()
----> 3 df[['volume', 'open', 'high', 'low', 'close']] =pd.Series( df[['volume', 'open', 'high', 'low', 'close']], dtype='float32')
      4 # df = pd.to_numeric(df, errors='ignore')

c:\python35\lib\site-packages\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    262             else:
    263                 data = _sanitize_array(data, index, dtype, copy,
--> 264                                        raise_cast_failure=True)
    265 
    266                 data = SingleBlockManager(data, index, fastpath=True)

c:\python35\lib\site-packages\pandas\core\series.py in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
   3275             raise Exception('Data must be 1-dimensional')
   3276         else:
-> 3277             subarr = _asarray_tuplesafe(data, dtype=dtype)
   3278 
   3279     # This is to prevent mixed-type Series getting all casted to

c:\python35\lib\site-packages\pandas\core\common.py in _asarray_tuplesafe(values, dtype)
    400             except ValueError:
    401                 # we have a list-of-list
--> 402                 result[:] = [tuple(x) for x in values]
    403 
    404     return result

ValueError: cannot copy sequence with size 5 to array axis with dimension 4999

Please tell me what I can do to fix this.

1 answer:

Answer 0 (score: 1)

You can use astype on a subset of the columns:

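import numpy as np
import pandas as pd

# Sample frame; .astype(str) stores every column as strings, like the question's data
df = pd.DataFrame({'A':list('abcdef'),
                   'low':[4,5,4,5,5,4],
                   'high':[7,8,9,4,2,3],
                   'open':[1,3,5,7,1,0],
                   'volume':[5,3,6,9,2,4],
                   'close':[5,3,6,9,2,4],
                   'F':list('aaabbb')}).astype(str)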

print(df.dtypes)
A         object
low       object
high      object
open      object
volume    object
close     object
F         object
dtype: object

# Convert only the numeric columns; the other columns keep their dtype
cols = ['volume', 'open', 'high', 'low', 'close']
df[cols] = df[cols].astype(np.float32)

print(df.dtypes)
A          object
low       float32
high      float32
open      float32
volume    float32
close     float32
F          object
dtype: object
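
The attempt in the question fails because pd.Series is one-dimensional, so it cannot be built from a five-column DataFrame; that is what the broadcast error in the traceback is reporting. Note also that astype raises a ValueError if any string in those columns cannot be parsed as a number. If that could be the case with your data, one possible approach is to apply pd.to_numeric column by column with errors='coerce' and then cast. The sample values below, including the 'n/a' entry, are made up for illustration and are not taken from the question's data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: numbers stored as strings, with one unparseable entry
df = pd.DataFrame({
    'volume': ['5', '3', 'n/a'],
    'open': ['1.5', '3.25', '5.0'],
    'complete': ['True', 'True', 'False'],
})

cols = ['volume', 'open']

# pd.to_numeric expects 1-D input, so apply it to each column separately.
# errors='coerce' replaces values that cannot be parsed with NaN instead of raising.
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce').astype(np.float32)

print(df.dtypes)
print(df['volume'].isna().sum())  # number of values that could not be parsed
```

Counting the NaN values afterwards is a quick way to check whether anything was silently dropped by the coercion.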