GraphLab Create SFrame: how to get the median of an SArray

Time: 2016-07-15 11:47:19

Tags: python pandas machine-learning data-analysis graphlab

I am learning GraphLab Create. After loading my data with

data=graphlab.SFrame.read_csv('test.csv')

I tried to get the median of one column:

data_train.fillna(('Credit_History',data_train['Credit_History'].median()))

But I got this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-247-50ed3eb09dcc> in <module>()
----> 1 data_train.fillna(('Credit_History',data_train['Credit_History'].median()))

AttributeError: 'SArray' object has no attribute 'median'

data.show() does display the median of this column. Does anyone know how to solve this?

2 Answers:

Answer 0: (score: 4)

I think I understand what you are trying to do. SFrame has no built-in median function. I would improvise like this:

import numpy as np
# Drop missing values before taking the median; fillna returns a new SFrame
data_train = data_train.fillna('Credit_History', np.median(data_train['Credit_History'].dropna()))
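The idea behind this answer is to compute the median from the values that are present and then fill the gaps with it. It can be sketched without GraphLab at all. The following is a minimal illustration that uses a plain Python list as a stand-in for an SArray; `fill_missing_with_median` is a hypothetical helper, not part of GraphLab Create or NumPy, and the sample values are made up.

```python
import statistics

def fill_missing_with_median(values):
    """Return a copy of `values` with None replaced by the median of the rest.

    Hypothetical helper for illustration; a plain list stands in for an SArray.
    """
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to compute a median from, so leave the data unchanged.
        return list(values)
    med = statistics.median(present)
    return [med if v is None else v for v in values]

# Made-up sample column with two missing entries.
credit_history = [1.0, None, 0.0, 1.0, None, 1.0]
filled = fill_missing_with_median(credit_history)
print(filled)
```

The missing entries are excluded before the median is computed, so they cannot skew the fill value.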

Answer 1: (score: 1)

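SArray has no median method. The best way to get the median is to call the sketch_summary method and then quantile on the result. More information on sketch summaries:

https://turi.com/products/create/docs/generated/graphlab.Sketch.html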

import numpy as np
import graphlab as gl

sf = gl.SFrame(np.random.rand(100))

sketch = sf['X1'].sketch_summary()
median = sketch.quantile(0.5)
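
For small data, the value returned by `quantile(0.5)` can be sanity-checked against an exact computation. The sketch below does not use GraphLab; `exact_quantile` is a hypothetical helper that applies the nearest-rank rule to a sorted plain list. That rule is an assumption chosen for illustration and is not necessarily how the Sketch object computes its result. As far as I understand, the sketch summary returns an estimate, so a small difference from the exact value would not be surprising.

```python
import math

def exact_quantile(values, q):
    """Nearest-rank quantile of a non-empty list, for q in [0, 1].

    Hypothetical helper for illustration; not part of GraphLab Create.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    # Nearest-rank: take the smallest value with at least a fraction q of the data at or below it.
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

# Made-up sample; with an odd number of values the 0.5 quantile is the middle element.
sample = [0.9, 0.1, 0.5, 0.3, 0.7]
print(exact_quantile(sample, 0.5))
```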