df.sample(3).values[:,1:].astype('float64')
>> array([[ 1.31199997e+02, 1.37149994e+02, 1.31199997e+02,
1.36320007e+02, 1.17088593e+02, 6.15015000e+05],
[ 1.35199997e+02, 1.36570007e+02, 1.34330002e+02,
1.35639999e+02, 1.16504501e+02, 3.52835000e+05],
[ 1.31419998e+02, 1.33500000e+02, 1.30759995e+02,
1.31779999e+02, 1.13189064e+02, 2.09805000e+05]])
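I am using pandas to read data from a CSV file and then converting the data to `numpy.float64`, but I get values in exponent notation such as `1.31199997e+02`. I expected plain numbers such as `131.199997`, not `1.31199997e+02`.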
My code:
import pandas as pd
df = pd.read_csv('data.csv') # reading csv
df.dtypes
>>
Date object
Open float64
High float64
Low float64
Close float64
Adj Close float64
Volume int64
dtype: object
a = df.sample(3).values[:,1:] # get array using `dataframe.values`
a
>> array([[131.199997, 137.149994, 131.199997, 136.320007, 117.08859299999999,
615015],
[135.199997, 136.570007, 134.330002, 135.639999, 116.504501, 352835],
[131.419998, 133.5, 130.759995, 131.779999, 113.18906399999999,
209805]], dtype=object)
a = a.astype('float64') # converting to `float64`
a
>> array([[ 1.31199997e+02, 1.37149994e+02, 1.31199997e+02,
1.36320007e+02, 1.17088593e+02, 6.15015000e+05],
[ 1.35199997e+02, 1.36570007e+02, 1.34330002e+02,
1.35639999e+02, 1.16504501e+02, 3.52835000e+05],
[ 1.31419998e+02, 1.33500000e+02, 1.30759995e+02,
1.31779999e+02, 1.13189064e+02, 2.09805000e+05]])
data.csv
Date,Open,High,Low,Close,Adj Close,Volume
2013-05-08,135.199997,136.570007,134.330002,135.639999,116.504501,352835
2013-05-09,135.800003,138.940002,135.199997,136.259995,117.037041,952515
2013-05-10,136.199997,138.199997,135.009995,135.389999,116.289780,444045
2013-05-13,135.000000,136.000000,131.639999,132.539993,113.841843,260395
2013-05-14,131.419998,133.500000,130.759995,131.779999,113.189064,209805
2013-05-15,131.199997,137.149994,131.199997,136.320007,117.088593,615015
Answer 0 (score: 2)
`131.199997` and `1.31199997e+02` are equivalent displays of the same number. Both are "normal floats".
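As a small illustration of that equivalence (a sketch in plain Python, with variable names chosen only for this example), the two spellings parse to the same float, and the notation only appears when the value is turned into text:

```python
# Two spellings of the same value: plain decimal and scientific notation.
plain = 131.199997
scientific = 1.31199997e+02

# Both literals are parsed to the same 64-bit float.
same_value = plain == scientific

# The notation is chosen only when the number is converted to text.
as_fixed = format(plain, ".6f")      # fixed-point text
as_exponent = format(plain, ".8e")   # scientific-notation text

print(same_value, as_fixed, as_exponent)
```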
In:
array([[131.199997, 137.149994, 131.199997, 136.320007, 117.08859299999999,
615015],
[135.199997, 136.570007, 134.330002, 135.639999, 116.504501, 352835],
[131.419998, 133.5, 130.759995, 131.779999, 113.18906399999999,
209805]], dtype=object)
each element is a Python float and is formatted individually, regardless of its value. Note that some of the strings are long and some are short.
In:
a = a.astype('float64') # converting to `float64`
a
array([[ 1.31199997e+02, 1.37149994e+02, 1.31199997e+02,
1.36320007e+02, 1.17088593e+02, 6.15015000e+05],
[ 1.35199997e+02, 1.36570007e+02, 1.34330002e+02,
1.35639999e+02, 1.16504501e+02, 3.52835000e+05],
[ 1.31419998e+02, 1.33500000e+02, 1.30759995e+02,
1.31779999e+02, 1.13189064e+02, 2.09805000e+05]])
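the array is displayed as a whole, using a format that works for both the smaller values (`1e2`, `100`) and the larger ones (`1e5`, `100000`). This format uses neat columns, which shows the 2d array structure.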
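The difference between the two displays can be reproduced with a single row. This is only a sketch: the values are copied from the question's output, the variable names are made up for the example, and it assumes NumPy's default print options are in effect.

```python
import numpy as np

# One row of values taken from the question's output.
row = [131.199997, 137.149994, 117.08859299999999, 615015]

obj_arr = np.array(row, dtype=object)   # each element stays a Python object
num_arr = obj_arr.astype("float64")     # one homogeneous float64 array

obj_text = repr(obj_arr)   # elements are rendered one at a time
num_text = repr(num_arr)   # one shared format is chosen for the whole array

print(obj_text)
print(num_text)
```

With default options, the wide spread between the smallest and largest value is what pushes the `float64` display into scientific notation.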
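While you can control how `numpy` displays such an array, that does not change the underlying numeric values. And for fast `numpy` calculations you need this numeric `dtype`, not the `object` one.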
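One way to see that display and data are separate is to format the array into a string and check that the array itself is untouched. This is a hedged sketch: the six-decimal format and the variable names are assumptions made for the example, not something the question asked for.

```python
import numpy as np

a = np.array([[131.199997, 615015.0],
              [135.199997, 352835.0]])
before = a.copy()

# Assumption for this sketch: six decimal places is the desired display.
text = np.array2string(a, formatter={"float_kind": lambda x: f"{x:.6f}"})
print(text)

# Formatting produced a string; the float64 data is unchanged.
unchanged = np.array_equal(a, before) and a.dtype == np.float64
```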
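Try `df.sample(3).values[:,1:-1]`. That should give just the float values around `100`. It is the last column, the integer one with values like `209805`, that triggers the scientific notation.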
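A runnable version of that slice might look like the following. It is a sketch under stated assumptions: three rows of the question's `data.csv` are inlined through `io.StringIO` so the example does not need the file, `sample(3)` is left out so the result is deterministic, and default print options are assumed.

```python
import io
import pandas as pd

# Three rows copied from the question's data.csv, inlined for this sketch.
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2013-05-08,135.199997,136.570007,134.330002,135.639999,116.504501,352835
2013-05-14,131.419998,133.500000,130.759995,131.779999,113.189064,209805
2013-05-15,131.199997,137.149994,131.199997,136.320007,117.088593,615015
"""
df = pd.read_csv(io.StringIO(csv_text))

# 1:-1 skips the Date column (first) and the Volume column (last).
prices = df.values[:, 1:-1].astype("float64")
prices_text = repr(prices)
print(prices_text)
```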
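Better yet, select the "Open, High, Low, Close, Adj Close" columns from the dataframe before applying `.values`. Those are all `float64` dtype, and the resulting array will have that dtype as well. Select the integer `Volume` column separately. You are already handling the string/object `Date` column separately.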
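Selecting the columns by name, as described above, could be sketched like this. The inlined CSV rows and the variable names (`float_cols`, `prices`, `volume`, `dates`) are assumptions for illustration only, not code from the original answer.

```python
import io
import pandas as pd

# Three rows copied from the question's data.csv, inlined for this sketch.
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2013-05-08,135.199997,136.570007,134.330002,135.639999,116.504501,352835
2013-05-14,131.419998,133.500000,130.759995,131.779999,113.189064,209805
2013-05-15,131.199997,137.149994,131.199997,136.320007,117.088593,615015
"""
df = pd.read_csv(io.StringIO(csv_text))

float_cols = ["Open", "High", "Low", "Close", "Adj Close"]
prices = df[float_cols].values   # already float64, no astype needed
volume = df["Volume"].values     # integer column, kept separately
dates = df["Date"]               # string/object column, kept separately

print(prices.dtype, volume.dtype)
```

Because every selected column is already `float64`, the array never passes through the `object` dtype.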
Answer 1 (score: 1)
Try adding:
np.set_printoptions(suppress=True)
as the first line after `import numpy as np`.
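A minimal sketch of the effect, using two rows shaped like the question's data (the array and variable names are made up for the example, and default print options are assumed at the start). Note that `np.set_printoptions` changes a global setting, so the sketch restores the default at the end.

```python
import numpy as np

a = np.array([[131.199997, 615015.0],
              [131.419998, 209805.0]])

default_text = repr(a)               # default options: scientific notation here

np.set_printoptions(suppress=True)   # global: affects every later array print
suppressed_text = repr(a)
print(suppressed_text)

np.set_printoptions(suppress=False)  # restore the default for this option
```

If the change should apply only to a few prints, the `np.printoptions(suppress=True)` context manager in recent NumPy versions is an alternative to the global call.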