If raised by a specific function

Time: 2016-03-22 21:19:56

Tags: python python-2.7 exception-handling scikit-learn

Question:

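sklearn allows user-defined distance functions to be used with several algorithms (e.g. KNN). However, it tests the user-defined function by creating a random numpy array in the __init__ of class PyFuncDistance(DistanceMetric), found at the end of the page. My function is defined for categorical variables, and to speed up the computation I pass the distance function a dictionary that I build in advance. Naturally, when sklearn tests it with an array of floats, it raises a KeyError, because the dictionary only has the attribute values as keys.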
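The failure described above can be reproduced without sklearn. The following is only a minimal sketch: the counts in `value_dict` are made-up illustrative values, and `probe` is a stand-in for the random float array that the question says the library uses for its self-test.

```python
import random

# Illustrative lookup table: encoded category -> count (values are made up).
value_dict = {0: 48, 1: 55, 2: 47, 3: 50}


def custom_distance(point1, point2, value_dict):
    # Sum of absolute differences of the looked-up values, feature by feature.
    return sum(abs(value_dict[a] - value_dict[b])
               for a, b in zip(point1, point2))


# Real (encoded) data works, because every value is a key of the dictionary.
ok = custom_distance([0, 1], [1, 3], value_dict)

# Stand-in for the library's probe: random floats in [0, 1), seeded for repeatability.
random.seed(0)
probe = [random.random() for _ in range(10)]

failed_key = None
try:
    custom_distance(probe, probe, value_dict)
except KeyError as err:
    # The first float that is not a dictionary key triggers the KeyError.
    failed_key = err.args[0]
```

The KeyError here comes from the probe values, not from the data, which is exactly the distinction the question wants to detect.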

Code:

import pandas as pd
import numpy as np
from sklearn import preprocessing
from sklearn.neighbors import KNeighborsClassifier
from sklearn import cross_validation

df = pd.DataFrame(np.random.choice(["a", "b", "c", "d"], (200, 4)))   

for col in df:
    le = preprocessing.LabelEncoder()
    le.fit(df[col])
    df[col] = le.transform(df[col])

value_dict = df[0].value_counts().to_dict()

def custom_distance(point1, point2, value_dict):
    # This is not the actual distance function, just a simplified version for reproducibility.
    distance = .0
    # Iterate over every feature index (0-based) of the two points.
    for i in range(len(point1)):
        distance += abs(value_dict[point1[i]] - value_dict[point2[i]])
    return distance

neigh_custom = KNeighborsClassifier(n_neighbors=10, metric=custom_distance, 
                        metric_params = {"value_dict": value_dict})

scores = cross_validation.cross_val_score(neigh_custom, df.ix[:,1:], df.ix[:,0], cv=10)

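Question:

To be sure that the error is caused by the test and not by the original data, is it possible to catch the exception only if it is raised by the __init__ of PyFuncDistance? Currently I check whether the numbers are between 0 and 1 to tell whether they were randomly generated, but I don't think that is good practice.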

1 answer:

Answer 0 (score: 1)

import traceback
import sys


try:
    scores = cross_validation.cross_val_score(neigh_custom, df.ix[:,1:], df.ix[:,0], cv=10)
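except Exception as err:
    # Format the active traceback as a list of strings.
    exc_type, exc_value, exc_traceback = sys.exc_info()
    sam = traceback.format_exception(exc_type, exc_value,
                                     exc_traceback)
    # sam[-1] is the exception message and sam[-2] is typically the innermost
    # frame, so sam[-3] should be the caller of the function that failed.
    if 'PyFuncDistance.__init__' in sam[-3]:
        print('I knew it')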

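If you want the exception to propagate for other problems, you can use 'raise', and use sam to print the traceback of the call that caused the problem.

Hope this helps!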
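A variation on the same idea, as a rough sketch rather than a definitive solution: instead of searching the formatted strings, walk the traceback frames with `traceback.extract_tb` and re-raise when the named function is not among them. The names `raised_inside`, `validate_metric`, `lookup_distance`, `TABLE` and `run` are hypothetical, and `validate_metric` is only a pure-Python stand-in for the library's self-test. For compiled code the reported function name may be different (for example dotted), so check what actually appears in your traceback first.

```python
import sys
import traceback

# Hypothetical lookup table used by the example distance function.
TABLE = {"a": 1, "b": 3}


def raised_inside(tb, function_name):
    """Return True if any frame of traceback `tb` belongs to `function_name`."""
    for _filename, _lineno, name, _line in traceback.extract_tb(tb):
        if name == function_name:
            return True
    return False


def lookup_distance(a, b):
    # Raises KeyError for any value that is not a key of TABLE.
    return abs(TABLE[a] - TABLE[b])


def validate_metric(func):
    # Stand-in for a library self-test that probes func with floats.
    return func(0.25, 0.75)


def run(call, *args):
    try:
        return call(*args)
    except KeyError:
        tb = sys.exc_info()[2]
        if raised_inside(tb, "validate_metric"):
            return "ignored: raised by the self-test"
        # Any other KeyError comes from real data, so let it propagate.
        raise
```

With this sketch, `run(validate_metric, lookup_distance)` swallows the probe's KeyError, while `run(lookup_distance, "a", "z")` still raises, so errors caused by the real data are not hidden.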