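I have been trying to solve this problem for days without getting a response. My earlier post was a bit messy, so let me rewrite it to show what I am after.
So far, I think the problem is related to loading the file from CSV, so I ran another test. The code that loads from the csv does not work, but the same logic works when I write the data out by hand as a dict of lists. Here is the code: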
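import pandas as pd
import numpy as np

# Load the CSV, using its first column ("Data") as the index.
data = pd.read_csv('dffile.csv', index_col=0)
df = data[['AreaNo.', 'ID']]

# Take the leading digits of each ID and test for evenness.
# This is the line that raises the error shown further down.
is_even = df['ID'].str.extract('([0-9]+).*').astype(int) % 2 == 0
Even = df[is_even]
Odd = df[~is_even]
print(Even)
print(Odd)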
This is what is in the csv:
print (df)
print (data)
>>>
AreaNo. ID
Data
1 25th 676
2 3rd 378
3 California 4740
4 Geary 3445
5 Turk 2801A
6 California 4726
7 Idaho 6239B
8 Idaho 6239.5
9 27th 558
10 29th 584
11 27th 557
12 21st 571 1/2
13 30th 524
14 27th 524
15 Alaska 258
16 Alaska 740
17 27th 645
18 27th 684
[18 rows x 2 columns]
Unit Abbr Transaction AreaNo. ID
Data
1 HSBC 1 25th 676
2 Wells NaN 3rd 378
3 Apt NaN California 4740
4 MTE 204 Geary 3445
5 FPC 202 Turk 2801A
6 HSBC NaN California 4726
7 Wells 5 Idaho 6239B
8 Apt 3 Idaho 6239.5
9 ETF NaN 27th 558
10 BAC NaN 29th 584
11 Wells NaN 27th 557
12 Apt NaN 21st 571 1/2
13 ETF G1 30th 524
14 Wells NaN 27th 524
15 Wells NaN Alaska 258
16 Apt NaN Alaska 740
17 ETF NaN 27th 645
18 Bac 3 27th 684
[18 rows x 4 columns]
>>> df.dtypes
AreaNo. object
ID object
dtype: object
Here is the output of df.ID.str.extract('([0-9]+).*'):
>>> df.ID.str.extract('([0-9]+).*')
Data
1 378
2 4740
3 3445
4 2801
5 4726
6 6239
7 6239
8 558
9 584
10 557
11 571
12 524
13 524
14 258
15 740
16 645
17 684
18 NaN
Name: ID, dtype: object
Here is the error from the interpreter:
Traceback (most recent call last):
File "<string>", line 420, in run_nodebug
File "C:\Users\0\Desktop\python\performance.py", line 16, in <module>
is_even = df['ID'].str.extract('([0-9]+).*').astype(int) % 2 == 0
File "C:\Python33\lib\site-packages\pandas\core\generic.py", line 2018, in astype
dtype, copy=copy, raise_on_error=raise_on_error)
File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 2416, in astype
return self.apply('astype', *args, **kwargs)
File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 2375, in apply
applied = getattr(blk, f)(*args, **kwargs)
File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 427, in astype
values=values)
File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 444, in _astype
values = com._astype_nansafe(self.values, dtype, copy=True)
File "C:\Python33\lib\site-packages\pandas\core\common.py", line 2222, in _astype_nansafe
return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
File "lib.pyx", line 733, in pandas.lib.astype_intsafe (pandas\lib.c:12697)
File "util.pxd", line 59, in util.set_value_at (pandas\lib.c:49357)
ValueError: cannot convert float NaN to integer
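To narrow down where the NaN in that message might come from, here is a minimal sketch. It assumes the cause is an ID cell that is missing or has no digits to match, which is only a guess. The sample values are hypothetical, and `expand=False` requires a pandas version newer than the one in the traceback.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the last entry is missing, as an empty CSV cell would be.
ids = pd.Series(['676', '2801A', '571 1/2', np.nan], name='ID', dtype=object)

# expand=False returns a Series rather than a one-column DataFrame.
digits = ids.str.extract(r'([0-9]+)', expand=False)

# Rows where nothing was extracted; these are the ones astype(int) cannot handle.
missing = digits.isna()
print(ids[missing])

failed = False
try:
    digits.astype(int)
except (ValueError, TypeError) as exc:
    failed = True
    print('astype(int) failed:', exc)
```

Running the same `isna()` check on the real `df['ID']` should show which rows, if any, produce the NaN.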
Here is the code I wrote earlier, which works:
import pandas as pd

df = pd.DataFrame({'ID': ['10A', '6.5', '4 A', '3 1/2'], 'Name': ['J', 'K', 'L', 'M']})

def ExtractU(df):
    # Leading digits of each ID, cast to int, then tested for evenness.
    is_even = df['ID'].str.extract(r'(\d+).*').astype(int) % 2 == 0
    Even = df[is_even]
    Odd = df[~is_even]
    return Even

print(ExtractU(df))
ID Name
0 10A J
1 6.5 K
2 4 A L
[3 rows x 2 columns]
>>> df.dtypes
ID object
Name object
dtype: object
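What am I doing wrong when loading the data? Why doesn't it work? I checked, and both frames have the same data types. How can I fix the code so that it works with the csv? This started happening when I switched to Python 3.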
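For reference, here is a sketch of one possible workaround. It assumes the failure comes from rows where no number can be extracted, which I have not confirmed. The frame below is a hypothetical stand-in for the csv, the names `num`, `has_num` and `unparsed` are mine, and `expand=False` needs a newer pandas than the one in the traceback. Instead of `astype(int)`, it converts with `pd.to_numeric(..., errors='coerce')` and keeps the unparseable rows separate.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV data; the last ID is missing on purpose.
df = pd.DataFrame({
    'AreaNo.': ['25th', 'Turk', '21st', '27th'],
    'ID': ['676', '2801A', '571 1/2', np.nan],
})

# Extract the first run of digits; rows without one become NaN.
digits = df['ID'].str.extract(r'([0-9]+)', expand=False)

# Convert to numbers, leaving NaN in place instead of raising.
num = pd.to_numeric(digits, errors='coerce')

# Only rows with a parsed number can be even or odd.
has_num = num.notna()
is_even = has_num & (num % 2 == 0)
is_odd = has_num & (num % 2 == 1)

even = df[is_even]
odd = df[is_odd]
unparsed = df[~has_num]  # rows to inspect by hand

print(even)
print(odd)
print(unparsed)
```

Keeping `unparsed` as its own frame avoids silently counting a missing ID as odd, which is what `~is_even` would do.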