Converting an ndarray of strings to float type

Time: 2018-11-16 00:01:39

Tags: python pandas

I did this:

cf = df.iloc[:,1:12]
cf = cf.values
print(cf)

This gives me:

[['$0.00 ' '$771.98 ' '$0.00 ' ..., '$771.98 ' '$0.00 ' '$1,543.96 ']
 ['$1,320.83 ' '$4,782.33 ' '$1,320.83 ' ..., '$1,954.45 ' '$0.00 '
  '$1,954.45 ']
 ['$2,043.61 ' '$0.00 ' '$4,087.22 ' ..., '$4,662.30 ' '$2,907.82 '
  '$1,549.53 ']
 ..., 
 ['$427.60 ' '$0.00 ' '$427.60 ' ..., '$427.60 ' '$0.00 ' '$427.60 ']
 ['$868.58 ' '$1,737.16 ' '$0.00 ' ..., '$868.58 ' '$868.58 ' '$868.58 ']
 ['$0.00 ' '$1,590.07 ' '$0.00 ' ..., '$787.75 ' '$0.00 ' '$0.00 ']]

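I need these values to be floats. This is not a duplicate of the suggested question, because the cf variable is an ndarray rather than a DataFrame.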

I tried this:

cf = df.iloc[:,1:12].replace('[\$,]', '', regex=True).astype(float)
cf = cf.values
print(cf)

But I get this error:

ValueError                                Traceback (most recent call last)
<ipython-input-152-f5009cb31652> in <module>()
      1 # Place as_of_date and cash flows into an unordered_map or dictionary
----> 2 cf = df.iloc[:,1:12].replace('[\$,]', '', regex=True).astype(float)
      3 cf = cf.values
      4 print(cf)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
     89                 else:
     90                     kwargs[new_arg_name] = new_arg_value
---> 91             return func(*args, **kwargs)
     92         return wrapper
     93     return _deprecate_kwarg

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors, **kwargs)
   3408         # else, only a single dtype is given
   3409         new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 3410                                      **kwargs)
   3411         return self._constructor(new_data).__finalize__(self)
   3412 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in astype(self, dtype, **kwargs)
   3222 
   3223     def astype(self, dtype, **kwargs):
-> 3224         return self.apply('astype', dtype=dtype, **kwargs)
   3225 
   3226     def convert(self, **kwargs):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   3089 
   3090             kwargs['mgr'] = self
-> 3091             applied = getattr(b, f)(**kwargs)
   3092             result_blocks = _extend_blocks(applied, result_blocks)
   3093 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in astype(self, dtype, copy, errors, values, **kwargs)
    469     def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
    470         return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 471                             **kwargs)
    472 
    473     def _astype(self, dtype, copy=False, errors='raise', values=None,

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in _astype(self, dtype, copy, errors, values, klass, mgr, raise_on_error, **kwargs)
    519 
    520                 # _astype_nansafe works fine with 1-d only
--> 521                 values = astype_nansafe(values.ravel(), dtype, copy=True)
    522                 values = values.reshape(self.shape)
    523 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy)
    634 
    635     if copy:
--> 636         return arr.astype(dtype)
    637     return arr.view(dtype)
    638 

ValueError: could not convert string to float: '(641.99)'

I'm not sure how to fix this. Please revise the answer so that I can resolve this and move on to other things.
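The traceback ends at the value '(641.99)'. Parentheses are a common accounting notation for negative amounts, and the pattern [\$,] removes only dollar signs and commas, so the parentheses survive and float conversion fails. A minimal sketch that reproduces this, assuming a small made-up frame (the column names and values are hypothetical, not the asker's data):

```python
import pandas as pd

# Hypothetical sample data, not the asker's actual frame.
df = pd.DataFrame({
    "a": ["$0.00", "$1,320.83"],
    "b": ["$771.98", "(641.99)"],
})

# Same pattern as in the question: strips '$' and ',' only.
cleaned = df.replace(r"[\$,]", "", regex=True)
print(cleaned)

error_message = None
try:
    cleaned.astype(float)
except ValueError as exc:
    # The parenthesised value cannot be parsed as a float.
    error_message = str(exc)
    print(error_message)
```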

Following the suggested answer, I did this:

cf = df.iloc[:,1:12].replace('[^0-9]', '', regex=True).astype(float)
cf = cf.values
print(cf)

This gives me:

[[      0.   77198.       0. ...,   77198.       0.  154396.]
 [ 132083.  478233.  132083. ...,  195445.       0.  195445.]
 [ 204361.       0.  408722. ...,  466230.  290782.  154953.]
 ..., 
 [  42760.       0.   42760. ...,   42760.       0.   42760.]
 [  86858.  173716.       0. ...,   86858.   86858.   86858.]
 [      0.  159007.       0. ...,   78775.       0.       0.]]

The values are not correct and need to be adjusted: the decimal point is gone, so $771.98 has become 77198.
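The scaling error comes from the character class: [^0-9] removes every non-digit, including the decimal point, while [^0-9.] keeps it. A short illustration with the standard re module on one value taken from the output above (the variable names are only for illustration):

```python
import re

raw = "$1,543.96 "

# Removes everything that is not a digit, so the '.' is lost too.
no_dot = re.sub(r"[^0-9]", "", raw)

# Keeps digits and the decimal point.
with_dot = re.sub(r"[^0-9.]", "", raw)

print(float(no_dot), float(with_dot))
```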

1 answer:

Answer 0 (score: 2)

You can do this:

print(df.replace(r'[\$,]', '', regex=True).astype(float))

Then you will get what you want.

Update

Do:

print(df.replace('[^0-9.]', '', regex=True).astype(float))

Then:

print(df)

as needed.
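One caveat: stripping everything except digits and the dot also discards the parentheses, so '(641.99)' becomes a positive 641.99. If parentheses are meant as negative amounts (an assumption about the data, not something the question confirms), a sketch along these lines may be closer to what is wanted. The helper name currency_to_float and the sample array are hypothetical:

```python
import numpy as np
import pandas as pd


def currency_to_float(frame: pd.DataFrame) -> pd.DataFrame:
    """Convert strings such as '$1,543.96 ' or '(641.99)' to floats."""
    # Assumption: an opening parenthesis marks a negative amount.
    signed = frame.replace(r"\(", "-", regex=True)
    # Drop everything except digits, the decimal point and the minus sign.
    stripped = signed.replace(r"[^0-9.\-]", "", regex=True)
    return stripped.astype(float)


# Made-up string ndarray, similar in shape to the cf in the question.
raw = np.array(
    [["$0.00 ", "$1,543.96 "],
     ["(641.99)", "$427.60 "]],
    dtype=object,
)

# Wrap the ndarray in a DataFrame, convert, and go back to an ndarray.
cf = currency_to_float(pd.DataFrame(raw)).to_numpy()
print(cf)
```

With the frame from the question, the same helper could be applied before taking the values, for example currency_to_float(df.iloc[:, 1:12]).values.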