Pandas ValueError: cannot convert float NaN to integer when reading a CSV file

Time: 2014-02-08 04:22:22

Tags: python csv io pandas

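I have been trying to solve this problem for several days without getting a response. My earlier post looked a bit messy, so let me rewrite it to show what I want.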

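So far, I think the problem is related to loading the file from CSV, so I ran another test. The code that loads the data from the CSV file fails, but the same logic works when I build the DataFrame directly in code. Here is the code: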

import pandas as pd
import numpy as np

data = pd.read_csv('dffile.csv', index_col=0)
df=data[['AreaNo.','ID']]
is_even = df['ID'].str.extract('([0-9]+).*').astype(int) % 2 == 0
Even=df[is_even]
Odd=df[~is_even]
print (Even)
print (Odd)

This is what the CSV contains:

print (df)
print (data)

>>> 
         AreaNo.       ID
Data                     
1           25th      676
2            3rd      378
3     California     4740
4          Geary     3445
5           Turk    2801A
6     California     4726
7          Idaho    6239B
8          Idaho   6239.5
9           27th      558
10          29th      584
11          27th      557
12          21st  571 1/2
13          30th      524
14          27th      524
15        Alaska      258
16        Alaska      740
17          27th      645
18          27th      684

[18 rows x 2 columns]
     Unit Abbr Transaction     AreaNo.       ID
Data                                           
1         HSBC           1        25th      676
2        Wells         NaN         3rd      378
3     Apt              NaN  California     4740
4          MTE         204       Geary     3445
5          FPC         202        Turk    2801A
6         HSBC         NaN  California     4726
7        Wells           5       Idaho    6239B
8     Apt                3       Idaho   6239.5
9          ETF         NaN        27th      558
10         BAC         NaN        29th      584
11       Wells         NaN        27th      557
12    Apt              NaN        21st  571 1/2
13         ETF    G1              30th      524
14       Wells         NaN        27th      524
15       Wells         NaN      Alaska      258
16    Apt              NaN      Alaska      740
17         ETF         NaN        27th      645
18         Bac           3        27th      684

[18 rows x 4 columns]

>>> df.dtypes
AreaNo.    object
ID         object
dtype: object

Here is the output of df.ID.str.extract('([0-9]+).*'):

>>> df.ID.str.extract('([0-9]+).*')
Data
1        378
2       4740
3       3445
4       2801
5       4726
6       6239
7       6239
8        558
9        584
10       557
11       571
12       524
13       524
14       258
15       740
16       645
17       684
18       NaN
Name: ID, dtype: object
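The output above ends in NaN, and that is the value astype(int) later trips over. As a rough sketch only, this is how one might list the rows for which the pattern extracts nothing. The sample IDs are made up for illustration, and expand=False assumes a pandas version that has that parameter (recent versions return a DataFrame from str.extract unless it is set).

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one ID is missing and one contains no digits at all.
ids = pd.Series(["676", "2801A", np.nan, "N/A"], dtype=object, name="ID")

# expand=False asks for a Series back instead of a one-column DataFrame.
digits = ids.str.extract(r"([0-9]+)", expand=False)

# Boolean mask of the rows where nothing could be extracted.
no_match = digits.isna()

# Show the original values that produced NaN.
print(ids[no_match])
```

Running the same check against the real ID column should show which rows need attention before any integer cast.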

Here is the error from the interpreter:

Traceback (most recent call last):
  File "<string>", line 420, in run_nodebug
  File "C:\Users\0\Desktop\python\performance.py", line 16, in <module>
    is_even = df['ID'].str.extract('([0-9]+).*').astype(int) % 2 == 0
  File "C:\Python33\lib\site-packages\pandas\core\generic.py", line 2018, in astype
    dtype, copy=copy, raise_on_error=raise_on_error)
  File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 2416, in astype
    return self.apply('astype', *args, **kwargs)
  File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 2375, in apply
    applied = getattr(blk, f)(*args, **kwargs)
  File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 427, in astype
    values=values)
  File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 444, in _astype
    values = com._astype_nansafe(self.values, dtype, copy=True)
  File "C:\Python33\lib\site-packages\pandas\core\common.py", line 2222, in _astype_nansafe
    return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
  File "lib.pyx", line 733, in pandas.lib.astype_intsafe (pandas\lib.c:12697)
  File "util.pxd", line 59, in util.set_value_at (pandas\lib.c:49357)
ValueError: cannot convert float NaN to integer
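The last frame of the traceback says the cast itself is what fails: NaN has no integer representation. A minimal sketch of that behaviour, using made-up values rather than the real CSV, plus one possible way around it. The nullable "Int64" dtype used here is an assumption about the installed pandas version; it did not exist in the pandas release shown in the traceback.

```python
import numpy as np
import pandas as pd

# Hypothetical extracted values: digit strings plus one missing entry.
digits = pd.Series(["676", "378", np.nan], dtype=object)

try:
    digits.astype(int)
    error_message = None
except ValueError as exc:
    # The cast is refused because NaN cannot be stored in an integer column.
    error_message = str(exc)
print(error_message)

# Possible workaround (assumes the nullable "Int64" dtype is available):
# convert to numbers first, then to an integer dtype that can hold missing values.
nullable = pd.to_numeric(digits, errors="coerce").astype("Int64")

# Missing values are treated as "not even" here; that is a choice, not a rule.
is_even = (nullable % 2 == 0).fillna(False).astype(bool)
print(is_even.tolist())
```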

Here is the code I wrote earlier, which works:

import pandas as pd

df=pd.DataFrame({'ID': ['10A','6.5', '4 A', '3 1/2'], 'Name': ['J','K','L','M']})


def ExtractU(df):
    is_even = df['ID'].str.extract(r'(\d+).*').astype(int) % 2 == 0
    Even=df[is_even]
    Odd=df[~is_even]
    return Even

print (ExtractU(df))

    ID Name
0  10A    J
1  6.5    K
2  4 A    L

[3 rows x 2 columns]

>>> df.dtypes
ID      object
Name    object
dtype: object

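What am I doing wrong when loading the data? Why doesn't it work? I checked, and both DataFrames have the same dtypes. How can I fix the code so that it works with the CSV? This started happening when I switched to Python 3.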
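One possible direction, as a sketch only: cast just the rows where the extraction matched and handle the rest separately. The CSV text below is a made-up stand-in for dffile.csv, and names such as has_number and unparsed are illustrative. Passing dtype={"ID": str} is an assumption that the IDs should always be read as text; expand=False assumes a pandas version that has that parameter.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for dffile.csv; the last ID has no digits.
csv_text = """Data,AreaNo.,ID
1,25th,676
2,Turk,2801A
3,Idaho,6239.5
4,21st,571 1/2
5,27th,unknown
"""

# Keep the ID column as text even if every value happens to look numeric.
data = pd.read_csv(io.StringIO(csv_text), index_col=0, dtype={"ID": str})
df = data[["AreaNo.", "ID"]]

# First run of digits as a Series; rows without digits become NaN.
digits = df["ID"].str.extract(r"([0-9]+)", expand=False)
has_number = digits.notna()

# Cast only the rows that matched, then align the result back to df's index.
parity = digits[has_number].astype(int) % 2
is_even = (parity == 0).reindex(df.index, fill_value=False)
is_odd = (parity == 1).reindex(df.index, fill_value=False)

even = df[is_even]
odd = df[is_odd]
unparsed = df[~has_number]  # rows that need manual inspection
print(even, odd, unparsed, sep="\n\n")
```

Keeping the unparsed rows in their own frame avoids silently counting them as odd, which is what a plain ~is_even would do.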

0 answers:

No answers yet