Question

I have two input spreadsheets.

Sheet 1 has 7 columns and 3 rows:

     / FID  / Total       / A1          / B1  /     A2      / B2 
   1 / 1   / 0.720168405 / 0.635589112 / XXX / 0.031112358 / YYY
   1 / 2  / 0.760438562 / 0.328168557 / YYY / 0.311172576 / ZZZ

Sheet 2 has 2 columns and 4 rows:

       /  0
  XXX  /  0.55
  YYY  / 0.52
  ZZZ  / 0.35

Here is the code:

import pandas as pd

df = pd.read_excel("C:/Users/Sheet1.xls")

df2 = pd.read_excel("C:/Users/Sheet2.xlsx")

dictionary = df2.to_dict(orient='dict')

b = df.filter(like ='A').values
c = df.filter(like ='B').replace(dictionary[0]).astype(float).values

df['AA'] = ((c * b).sum(axis =1))

df['BB'] = df.AA / df.Total

def custom_round(x, base=5):
    return base * round(float(x)/base)

df['C'] = df['BB'].apply(lambda x: custom_round(x, base=.05))
df['C'] = "X = " + df['C'].apply(lambda s: '{:,.2f}'.format(s))

df.to_excel("C:/Users/Results.xlsx")

print(df)

I get this error message: ValueError: could not convert string to float: XXX

ValueError                                Traceback (most recent call last)
<ipython-input-1-f42c7cb99da5> in <module>()
      8 
      9 b = df.filter(like ='A').values
---> 10 c = df.filter(like ='B').replace(dictionary[0]).astype(float).values
     11 
     12 df['AA'] = ((c * b).sum(axis =1))

C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\generic.pyc in astype(self, dtype, copy, errors, **kwargs)
   5689             # else, only a single dtype is given
   5690             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 5691                                          **kwargs)
   5692             return self._constructor(new_data).__finalize__(self)
   5693 

C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\internals\managers.pyc in astype(self, dtype, **kwargs)
    529 
    530     def astype(self, dtype, **kwargs):
--> 531         return self.apply('astype', dtype=dtype, **kwargs)
    532 
    533     def convert(self, **kwargs):

C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\internals\managers.pyc in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
    393                                             copy=align_copy)
    394 
--> 395             applied = getattr(b, f)(**kwargs)
    396             result_blocks = _extend_blocks(applied, result_blocks)
    397 

C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\internals\blocks.pyc in astype(self, dtype, copy, errors, values, **kwargs)
    532     def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
    533         return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 534                             **kwargs)
    535 
    536     def _astype(self, dtype, copy=False, errors='raise', values=None,

C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\internals\blocks.pyc in _astype(self, dtype, copy, errors, values, **kwargs)
    631 
    632                     # _astype_nansafe works fine with 1-d only
--> 633                     values = astype_nansafe(values.ravel(), dtype, copy=True)
    634 
    635                 # TODO(extension)

C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\dtypes\cast.pyc in astype_nansafe(arr, dtype, copy, skipna)
    700     if copy or is_object_dtype(arr) or is_object_dtype(dtype):
    701         # Explicit copy, or required since NumPy can't view from / to object.
--> 702         return arr.astype(dtype, copy=True)
    703 
    704     return arr.view(dtype)

ValueError: could not convert string to float: XXX

Answer 1

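On the sixth line of your code, I see that you are trying to replace certain values in the DataFrame (XXX, YYY, ... with 0.55, 0.52, ...). But the dictionary you end up passing is {0: 0.55, 1: 0.52, ...}, whose keys are actually the row indices, not the labels XXX, YYY, ... Nothing gets replaced, so astype(float) is still given the string XXX and fails.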
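
To see why the keys come out as row indices, here is a minimal sketch that rebuilds Sheet 2 in memory instead of reading the Excel file. The column name "Unnamed: 0" is an assumption about how pandas labels the blank header cell; the values are copied from the question.

```python
import pandas as pd

# Hypothetical in-memory stand-in for Sheet 2 as pd.read_excel would load it:
# the blank header cell is assumed to become "Unnamed: 0", the other header is 0.
df2 = pd.DataFrame({"Unnamed: 0": ["XXX", "YYY", "ZZZ"], 0: [0.55, 0.52, 0.35]})

# orient='dict' yields {column -> {row index -> value}}.
dictionary = df2.to_dict(orient="dict")
lookup = dictionary[0]

# The keys are the row positions 0, 1, 2, so "XXX" is never matched by replace().
print(lookup)
```

Because none of the keys equals "XXX", "YYY" or "ZZZ", replace() leaves the B columns unchanged.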

I changed the Sheet 2 header to make indexing easier, e.g.

   0   /  1
  XXX  /  0.55
  YYY  /  0.52
  ZZZ  /  0.35

and set the index from the existing column 0 by replacing your line 4 with

dictionary = df2.set_index(0)[1].to_dict()

and your line 6 with

c = df.filter(like ='B').replace(dictionary).astype(float).values

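This gives replace() a proper dictionary, keyed by the labels XXX, YYY, ZZZ, so the strings in the DataFrame are actually replaced.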
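
As a rough end-to-end check, the sketch below applies both changed lines to in-memory stand-ins for the two sheets, so it runs without the Excel files. The DataFrame constructors are assumptions made for illustration; the values come from the question, and Sheet 2 uses the relabelled headers 0 and 1.

```python
import pandas as pd

# Stand-in for Sheet 1 (values copied from the question).
df = pd.DataFrame({
    "FID": [1, 2],
    "Total": [0.720168405, 0.760438562],
    "A1": [0.635589112, 0.328168557],
    "B1": ["XXX", "YYY"],
    "A2": [0.031112358, 0.311172576],
    "B2": ["YYY", "ZZZ"],
})

# Stand-in for Sheet 2 with the headers relabelled to 0 and 1.
df2 = pd.DataFrame({0: ["XXX", "YYY", "ZZZ"], 1: [0.55, 0.52, 0.35]})

# Line 4: map each label to its number, e.g. "XXX" -> 0.55.
dictionary = df2.set_index(0)[1].to_dict()

# Lines 5 and 6: numeric A columns, and B columns with labels swapped for numbers.
b = df.filter(like="A").values
c = df.filter(like="B").replace(dictionary).astype(float).values

# Row-wise sum of A1*B1 + A2*B2, then the ratio to Total.
df["AA"] = (c * b).sum(axis=1)
df["BB"] = df["AA"] / df["Total"]
print(df[["FID", "AA", "BB"]])
```

Some pandas versions may emit a FutureWarning about downcasting in replace(); the explicit astype(float) makes the final dtype unambiguous either way.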
