Graphlab - OverflowError: long too big to convert

Time: 2016-04-23 12:56:49

Tags: python graphlab sframe

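I am creating different polynomial regression models by passing different powers of the same feature.

So if I want a degree-3 polynomial model of a feature 'x', I pass x^1, x^2 and x^3 to the regression model as features.

The following function builds an SFrame of the powers of 'x', given the values of 'x' and the highest degree to generate.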

import graphlab

def polynomial_sframe(feature, degree):

    # assume that degree >= 1
    # initialize the SFrame:
    poly_sframe = graphlab.SFrame()

    # poly_sframe['power_1'] equal to the passed feature
    poly_sframe['power_1'] = feature

    # first check if degree > 1
    if degree > 1:

        # then loop over the remaining degrees:
        # range usually starts at 0 and stops at the endpoint-1.
        for power in range(2, degree+1):

            # give the column a name:
            name = 'power_' + str(power)

            # then assign poly_sframe[name] to the appropriate power of feature
            poly_sframe[name] = feature.apply(lambda x: x**power)

    return poly_sframe

Using the SFrame generated by the function above, I can then build polynomial models of different degrees of x, as shown in the code below.

poly3_data = polynomial_sframe(sales['sqft_living'], 3)

my_features = poly3_data.column_names() # get the name of the features

poly3_data['price'] = sales['price'] # add price to the data since it's the target

model3 = graphlab.linear_regression.create(poly3_data, target = 'price', features = my_features, validation_set = None)

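Graphlab can build models up to degree 4. Beyond that it fails with the traceback shown after the following code, which reports an overflow error.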

poly15_data = polynomial_sframe(sales['sqft_living'], 5)

my_features = poly15_data.column_names() # get the name of the features

poly15_data['price'] = sales['price'] # add price to the data since it's the target

model15 = graphlab.linear_regression.create(poly15_data, target = 'price', features = my_features, validation_set = None)

---------------------------------------------------------------------------
ToolkitError                              Traceback (most recent call last)
<ipython-input-76-df5cbc0b6314> in <module>()
      2 my_features = poly15_data.column_names() # get the name of the features
      3 poly15_data['price'] = sales['price'] # add price to the data since it's the target
----> 4 model15 = graphlab.linear_regression.create(poly15_data, target = 'price', features = my_features, validation_set = None)

C:\Users\mk\Anaconda2\envs\dato-env\lib\site-packages\graphlab\toolkits\regression\linear_regression.pyc in create(dataset, target, features, l2_penalty, l1_penalty, solver, feature_rescaling, convergence_threshold, step_size, lbfgs_memory_level, max_iterations, validation_set, verbose)
    284                         step_size = step_size,
    285                         lbfgs_memory_level = lbfgs_memory_level,
--> 286                         max_iterations = max_iterations)
    287 
    288     return LinearRegression(model.__proxy__)

C:\Users\mk\Anaconda2\envs\dato-env\lib\site-packages\graphlab\toolkits\_supervised_learning.pyc in create(dataset, target, model_name, features, validation_set, verbose, distributed, **kwargs)
    451     else:
    452         ret = _graphlab.toolkits._main.run("supervised_learning_train",
--> 453                                            options, verbose)
    454         model = SupervisedLearningModel(ret['model'], model_name)
    455 

C:\Users\mk\Anaconda2\envs\dato-env\lib\site-packages\graphlab\toolkits\_main.pyc in run(toolkit_name, options, verbose, show_progress)
     87         _get_metric_tracker().track(metric_name, value=1, properties=track_props, send_sys_info=False)
     88 
---> 89         raise ToolkitError(str(message))

ToolkitError: Exception in python callback function evaluation: 
OverflowError('long too big to convert',): 
Traceback (most recent call last):
File "graphlab\cython\cy_pylambda_workers.pyx", line 426, in graphlab.cython.cy_pylambda_workers._eval_lambda
File "graphlab\cython\cy_pylambda_workers.pyx", line 171, in graphlab.cython.cy_pylambda_workers.lambda_evaluator.eval_simple
File "graphlab\cython\cy_flexible_type.pyx", line 1193, in graphlab.cython.cy_flexible_type.process_common_typed_list
File "graphlab\cython\cy_flexible_type.pyx", line 1138, in graphlab.cython.cy_flexible_type._fill_typed_sequence
File "graphlab\cython\cy_flexible_type.pyx", line 1385, in graphlab.cython.cy_flexible_type._ft_translate
OverflowError: long too big to convert

Is this error happening because my computer lacks the memory to compute the regression model? How can I fix it?

1 answer:

Answer 0: (score: 0)

It looks like you have a typo at the end of this line: poly15_data = polynomial_sframe(sales['sqft_living'], 5)

Change 5 to 15 and it should work fine.
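
Separately from the degree argument, the traceback in the question shows the OverflowError being raised while the lambda's return values are converted (the `_ft_translate` frame), which points at an integer-range problem rather than at memory. The following is a minimal sketch in plain Python, not GraphLab code. It assumes that integer columns are stored as signed 64-bit values and uses a made-up square-footage value of 10000; the helper names `fits_in_int64` and `float_power` are hypothetical and exist only for this illustration.

```python
# Assumption: integer columns are stored as signed 64-bit values.
INT64_MAX = 2**63 - 1
INT64_MIN = -2**63


def fits_in_int64(value):
    """Return True if an integer fits in a signed 64-bit slot."""
    return INT64_MIN <= value <= INT64_MAX


def float_power(x, power):
    """Raise x to `power` in floating point, so the result is never a big integer."""
    return float(x) ** power


# Hypothetical square-footage value, chosen only for illustration.
sqft = 10000

for power in range(1, 7):
    # Column 2: does the exact integer power fit in 64 bits?
    # Column 3: the same power computed as a float.
    print(power, fits_in_int64(sqft ** power), float_power(sqft, power))
```

With values of this size the integer power stops fitting at degree 5, which is consistent with the question's observation that models up to degree 4 work. Having the lambda in polynomial_sframe return float(x)**power instead of x**power would be one way to apply the same idea, though that is untested against the data in the question.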