pandas - an integer is required

Date: 2018-04-20 11:33:39

Tags: pandas type-conversion

My columns should have dtype float64; I converted them from object dtype with the code below. Input:

df.iloc[:, 9:33] = df.iloc[:, 9:33].apply(lambda x: x.str.extract(r'(\d+)', expand=False).astype(float))

When I try to compute the sum, I get the following error:

Output:

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-41-5461a828af6e> in <module>()
----> 1 amount = df['Total'].sum()
   2 print(amount)

1 answer:

Answer 0 (score: 1)

To me this looks like a bug. A possible solution is not to assign the result back with iloc, but to use concat instead:

# sample data
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.choice(['aa45', 's789'], size=(5, 20)))
print(df)
     0     1     2     3     4     5     6     7     8     9     10    11  \
0  aa45  s789  aa45  aa45  aa45  aa45  aa45  s789  s789  aa45  s789  s789   
1  aa45  aa45  s789  s789  s789  aa45  s789  aa45  aa45  aa45  aa45  s789   
2  aa45  s789  aa45  s789  s789  s789  aa45  aa45  aa45  aa45  s789  s789   
3  aa45  s789  aa45  s789  aa45  aa45  aa45  aa45  s789  aa45  aa45  s789   
4  s789  aa45  aa45  aa45  aa45  s789  aa45  s789  s789  aa45  s789  s789   

     12    13    14    15    16    17    18    19  
0  aa45  s789  aa45  s789  aa45  s789  s789  aa45  
1  s789  s789  aa45  aa45  s789  aa45  aa45  s789  
2  aa45  aa45  s789  aa45  s789  aa45  aa45  s789  
3  s789  s789  s789  aa45  aa45  aa45  aa45  s789  
4  s789  aa45  s789  s789  s789  aa45  aa45  aa45  

I tried your solution, but after assigning back, the columns are converted to objects (strings) instead of float64:

df.iloc[:, 9:13] = df.iloc[:, 9:13].apply(lambda x: x.str.extract(r'(\d+)', expand=False).astype(float))
print (df)
     0     1     2     3     4     5     6     7     8   9    10   11   12  \
0  aa45  s789  aa45  aa45  aa45  aa45  aa45  s789  s789  45  789  789   45   
1  aa45  aa45  s789  s789  s789  aa45  s789  aa45  aa45  45   45  789  789   
2  aa45  s789  aa45  s789  s789  s789  aa45  aa45  aa45  45  789  789   45   
3  aa45  s789  aa45  s789  aa45  aa45  aa45  aa45  s789  45   45  789  789   
4  s789  aa45  aa45  aa45  aa45  s789  aa45  s789  s789  45  789  789  789   

     13    14    15    16    17    18    19  
0  s789  aa45  s789  aa45  s789  s789  aa45  
1  s789  aa45  aa45  s789  aa45  aa45  s789  
2  aa45  s789  aa45  s789  aa45  aa45  s789  
3  s789  s789  aa45  aa45  aa45  aa45  s789  
4  aa45  s789  s789  s789  aa45  aa45  aa45  
print (df.dtypes)
0     object
1     object
2     object
3     object
4     object
5     object
6     object
7     object
8     object
9     object
10    object
11    object
12    object
13    object
14    object
15    object
16    object
17    object
18    object
19    object
dtype: object
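
Before calling sum, it can help to list which columns are still non-numeric. The snippet below is a minimal sketch on a hypothetical two-column frame (the names `demo`, `a` and `b` are illustrative and not part of the question's data); it uses `select_dtypes` to pick out everything that is not a numeric dtype.

```python
import pandas as pd

# Hypothetical small frame: one string column, one float column
demo = pd.DataFrame({'a': ['aa45', 's789'], 'b': [45.0, 789.0]})

# Labels of the columns whose dtype is not numeric
non_numeric = demo.select_dtypes(exclude='number').columns.tolist()
print(non_numeric)
```

Any column reported here would need converting before a numeric sum is meaningful.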

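A possible solution is to select the leading columns, the converted columns, and the trailing columns separately, then join them back together with concat: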

a = df.iloc[:, :9]
#in real data change 13 to 33
b = df.iloc[:, 9:13].apply(lambda x: x.str.extract(r'(\d+)', expand=False).astype(float))
#in real data change 13 to 33
c = df.iloc[:, 13:]

df = pd.concat([a,b,c], axis=1)
print (df)
     0     1     2     3     4     5     6     7     8     9      10     11  \
0  aa45  s789  aa45  aa45  aa45  aa45  aa45  s789  s789  45.0  789.0  789.0   
1  aa45  aa45  s789  s789  s789  aa45  s789  aa45  aa45  45.0   45.0  789.0   
2  aa45  s789  aa45  s789  s789  s789  aa45  aa45  aa45  45.0  789.0  789.0   
3  aa45  s789  aa45  s789  aa45  aa45  aa45  aa45  s789  45.0   45.0  789.0   
4  s789  aa45  aa45  aa45  aa45  s789  aa45  s789  s789  45.0  789.0  789.0   

      12    13    14    15    16    17    18    19  
0   45.0  s789  aa45  s789  aa45  s789  s789  aa45  
1  789.0  s789  aa45  aa45  s789  aa45  aa45  s789  
2   45.0  aa45  s789  aa45  s789  aa45  aa45  s789  
3  789.0  s789  s789  aa45  aa45  aa45  aa45  s789  
4  789.0  aa45  s789  s789  s789  aa45  aa45  aa45  
print (df.dtypes)
0      object
1      object
2      object
3      object
4      object
5      object
6      object
7      object
8      object
9     float64
10    float64
11    float64
12    float64
13     object
14     object
15     object
16     object
17     object
18     object
19     object
dtype: object
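
As an alternative to concat, assigning each converted column back by its label replaces the whole column, so the new dtype is kept. This is only a sketch under that assumption, reusing the sample data from this answer; the names `cols`, `digits` and `total` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Same sample data as above
np.random.seed(123)
df = pd.DataFrame(np.random.choice(['aa45', 's789'], size=(5, 20)))

# In real data change 13 to 33
cols = df.columns[9:13]

for col in cols:
    # Pull out the first run of digits, then convert to float
    digits = df[col].str.extract(r'(\d+)', expand=False)
    # Assigning by label replaces the column, so it becomes float64
    df[col] = digits.astype(float)

print(df.dtypes[cols])

# Sum over the converted columns only
total = df[cols].sum().sum()
print(total)
```

With the columns stored as float64, a sum over them works on numbers rather than on strings.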