Dask: DataFrame to_dask_array not working

Time: 2019-09-11 18:19:27

Tags: python pandas dataframe dask dask-distributed

I am trying to create a Dask array from a Dask DataFrame, as shown below:


input.csv

A,B,C
5,7,8
9,6,11
15,3,2
0,1,5
2,4,3


import dask.dataframe as dd

# Read the CSV into a Dask DataFrame
df1 = dd.read_csv('input.csv')
print(type(df1))
print(df1.head())

# Works: .values returns a Dask array
arr = df1.values
print(type(arr))
print(arr.compute())

# Does not work: I want to pass lengths=True, which triggers immediate computation of the chunk sizes
arr2 = df1.to_dask_array(lengths=True)
print(type(arr2))

Output:

C:\Anaconda\python.exe C:/App.py
<class 'dask.dataframe.core.DataFrame'>
    A   B   C
0   5   7   8
1   9   6  11
2  15   3   2
3   0   1   5
4   2   4   3
<class 'dask.array.core.Array'>
[[ 5  7  8]
 [ 9  6 11]
 [15  3  2]
 [ 0  1  5]
 [ 2  4  3]]
Traceback (most recent call last):
  File "C:/App.py", line 68, in <module>
    arr2 = df1.to_dask_array(lengths=True)
  File "C:\Anaconda\lib\site-packages\dask\dataframe\core.py", line 2313, in __getattr__
    raise AttributeError("'DataFrame' object has no attribute %r" % key)
AttributeError: 'DataFrame' object has no attribute 'to_dask_array'

0 answers:

No answers