Convert a numpy object-dtype array to a float dtype

Time: 2020-01-22 04:54:16

Tags: python pandas numpy

df.sample(3).values[:,1:].astype('float64')
>> array([[  1.31199997e+02,   1.37149994e+02,   1.31199997e+02,
          1.36320007e+02,   1.17088593e+02,   6.15015000e+05],
       [  1.35199997e+02,   1.36570007e+02,   1.34330002e+02,
          1.35639999e+02,   1.16504501e+02,   3.52835000e+05],
       [  1.31419998e+02,   1.33500000e+02,   1.30759995e+02,
          1.31779999e+02,   1.13189064e+02,   2.09805000e+05]])

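I am using pandas to read data from a csv file and then converting the data to numpy.float64, but I get values in exponential notation such as 1.31199997e+02. The expected output is a plain number such as 131.199997, not 1.31199997e+02.

My code: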

df = pd.read_csv('data.csv')                # reading csv
df.dtypes
>> 
Date          object
Open         float64
High         float64
Low          float64
Close        float64
Adj Close    float64
Volume         int64
dtype: object

a = df.sample(3).values[:,1:]        # get array using `dataframe.values`
a
>> array([[131.199997, 137.149994, 131.199997, 136.320007, 117.08859299999999,
        615015],
       [135.199997, 136.570007, 134.330002, 135.639999, 116.504501, 352835],
       [131.419998, 133.5, 130.759995, 131.779999, 113.18906399999999,
        209805]], dtype=object)

a = a.astype('float64')                # converting to `float64`
a
>> array([[  1.31199997e+02,   1.37149994e+02,   1.31199997e+02,
          1.36320007e+02,   1.17088593e+02,   6.15015000e+05],
       [  1.35199997e+02,   1.36570007e+02,   1.34330002e+02,
          1.35639999e+02,   1.16504501e+02,   3.52835000e+05],
       [  1.31419998e+02,   1.33500000e+02,   1.30759995e+02,
          1.31779999e+02,   1.13189064e+02,   2.09805000e+05]])

data.csv

Date,Open,High,Low,Close,Adj Close,Volume
2013-05-08,135.199997,136.570007,134.330002,135.639999,116.504501,352835
2013-05-09,135.800003,138.940002,135.199997,136.259995,117.037041,952515
2013-05-10,136.199997,138.199997,135.009995,135.389999,116.289780,444045
2013-05-13,135.000000,136.000000,131.639999,132.539993,113.841843,260395
2013-05-14,131.419998,133.500000,130.759995,131.779999,113.189064,209805
2013-05-15,131.199997,137.149994,131.199997,136.320007,117.088593,615015

2 Answers:

Answer 0: (score: 2)

`131.199997` and `1.31199997e+02` are equivalent displays of the same number. Both are "normal floats".
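As a quick check of that claim, here is a minimal sketch in plain Python (the variable names are illustrative, not from the question). It parses both spellings and compares the results, then formats one value both ways:

```python
# Both literals denote the same decimal value, so they parse to the same float.
plain = float("131.199997")
scientific = float("1.31199997e+02")

same_value = plain == scientific      # one number, two spellings

# Fixed-point vs. scientific notation is chosen only when the number is printed.
fixed_text = f"{scientific:.6f}"
exp_text = f"{plain:.8e}"

print(same_value, fixed_text, exp_text)
```

The stored value is identical in both cases; only the text produced at print time differs.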

In:

array([[131.199997, 137.149994, 131.199997, 136.320007, 117.08859299999999,
        615015],
       [135.199997, 136.570007, 134.330002, 135.639999, 116.504501, 352835],
       [131.419998, 133.5, 130.759995, 131.779999, 113.18906399999999,
        209805]], dtype=object)

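each element is a Python float and is formatted individually, independently of the other elements. Note that some of the strings are long and some are short.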
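To make the object-dtype case concrete, the following sketch builds a small object array by hand (an assumption: the question gets its array from `df.values`, not from `np.array`) and looks at what the elements are:

```python
import numpy as np

# An object array stores references to ordinary Python objects.
obj = np.array([[131.199997, 615015],
                [135.199997, 352835]], dtype=object)

# Each element keeps its own Python type, and each one is printed on its own.
element_types = {type(x).__name__ for x in obj.ravel()}

# Converting gives one homogeneous float64 buffer holding the same numbers.
num = obj.astype("float64")

print(element_types, obj.dtype, num.dtype)
```

Because an object array has no common numeric format, numpy falls back to each element's own representation, which is why the column widths are uneven.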

In:

a = a.astype('float64')                # converting to `float64`
a
array([[  1.31199997e+02,   1.37149994e+02,   1.31199997e+02,
          1.36320007e+02,   1.17088593e+02,   6.15015000e+05],
       [  1.35199997e+02,   1.36570007e+02,   1.34330002e+02,
          1.35639999e+02,   1.16504501e+02,   3.52835000e+05],
       [  1.31419998e+02,   1.33500000e+02,   1.30759995e+02,
          1.31779999e+02,   1.13189064e+02,   2.09805000e+05]])

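the array is displayed as a whole, using a format that works equally well for the small values (1e2, `100`) and the large ones (1e5, `100000`). This format uses neat columns that show the 2d array structure.

While you can control how `numpy` displays an array like this, doing so does not change the underlying numeric values. For fast numpy calculations you want this numeric `dtype`, not the `object` one.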
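A small sketch of that point, using a hand-typed subset of the question's numbers (the array contents and variable names are assumptions for illustration). `np.array2string` renders the same array with and without scientific notation, and the data is checked afterwards:

```python
import numpy as np

a = np.array([[131.199997, 117.088593, 615015.0],
              [131.419998, 113.189064, 209805.0]])
before = a.copy()

# Default behaviour: the wide value range makes numpy choose scientific notation.
default_text = np.array2string(a, suppress_small=False)

# Same array, fixed-point rendering requested for this one call only.
fixed_text = np.array2string(a, suppress_small=True)

# Formatting produced two different strings but did not touch the data.
unchanged = np.array_equal(a, before)

print(default_text)
print(fixed_text)
print(unchanged, a.dtype)
```

Passing the option per call keeps the global print settings untouched, which can be convenient in library code.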

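Try `df.sample(3).values[:,1:-1]`. That should give just the float values around `100`. It is the last, integer column, with values like `209805`, that triggers the scientific notation.

Better yet, select the "Open, High, Low, Close, Adj Close" columns from the dataframe before applying `.values`. Those are all `float64` dtype, and the resulting array will have that dtype as well. Select the integer `volume` column separately. You are already handling the string/object `date` column separately.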
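One way the column selection described above might look, as a sketch only: the CSV text is inlined from three rows of the question's `data.csv`, and the names `price_cols`, `prices` and `volume` are illustrative rather than taken from the answer.

```python
import io

import numpy as np
import pandas as pd

# Three rows copied from the question's data.csv, inlined so the sketch is self-contained.
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2013-05-08,135.199997,136.570007,134.330002,135.639999,116.504501,352835
2013-05-14,131.419998,133.500000,130.759995,131.779999,113.189064,209805
2013-05-15,131.199997,137.149994,131.199997,136.320007,117.088593,615015
"""
df = pd.read_csv(io.StringIO(csv_text))

# Select only the float columns before calling .values, so no object array is created.
price_cols = ["Open", "High", "Low", "Close", "Adj Close"]
prices = df[price_cols].values      # float64, all values of similar magnitude
volume = df["Volume"].values        # integer column, kept separate

print(prices.dtype, volume.dtype)
print(prices)
```

With the large `Volume` numbers out of the array, the remaining values have a narrow range and numpy prints them in fixed-point form by default.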


Answer 1: (score: 1)

Try adding:

np.set_printoptions(suppress=True)

as the first line after `import numpy as np`.
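Put together, a minimal script using this setting could look like the sketch below. The array is typed in by hand from the question's output instead of being read from `data.csv`, which is an assumption made to keep the example self-contained.

```python
import numpy as np

# Global setting: print floats in fixed-point form instead of scientific notation.
np.set_printoptions(suppress=True)

a = np.array([[131.199997, 137.149994, 131.199997, 136.320007, 117.088593, 615015.0],
              [131.419998, 133.500000, 130.759995, 131.779999, 113.189064, 209805.0]])

text = repr(a)   # uses the global print options set above
print(a)
```

Note that this affects every array printed afterwards in the same session, and it changes only the display, not the stored values.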