Question

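In my research into this Numba error, I have not come across this particular case. This is my first time using the package, so the answer may be obvious.

I have a function that computes engineered features for a dataset by adding, multiplying, and/or dividing each pair of columns in a DataFrame called data, and I want to test whether Numba will speed it up: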

from itertools import combinations

import numpy as np
import pandas as pd
from numba import jit

@jit
def engineer_features(engineer_type,features,joined):
    #choose which features to engineer (must be > 1)
    engineered = features

    if len(engineered) > 1:
        if 'Square' in engineer_type:
            sq = data[features].apply(np.square)
            sq.columns = map(lambda s:s + '_^2',features)

        for c1,c2 in combinations(engineered,2):
            if 'Add' in engineer_type:
                data['{0}+{1}'.format(c1,c2)] = data[c1] + data[c2]
            if 'Multiply' in engineer_type:
                data['{0}*{1}'.format(c1,c2)] = data[c1] * data[c2]
            if 'Divide' in engineer_type:
                data['{0}/{1}'.format(c1,c2)] = data[c1] / data[c2]

        if 'Square' in engineer_type and len(sq) > 0:
            data= pd.merge(data,sq,left_index=True,right_index=True)

        return data

When I call it with the list of features, engineer_type, and the dataset:

engineer_type = ['Square','Add','Multiply','Divide']   

df = engineer_features(engineer_type,features,joined)

I get the error: Failed at object (analyzing bytecode) 'DataFlowAnalysis' object has no attribute 'op_MAKE_FUNCTION'

Answer 1

Same problem here. I think the cause may be the lambda function, since Numba does not support function creation.
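
If the lambda really is what triggers the error, as this answer suggests, one possible workaround is to build the column names in ordinary Python, outside any @jit-decorated function, so the compiled code never has to create a function. A minimal sketch; the helper names squared_names and pair_names are hypothetical and not part of the question's code:

```python
from itertools import combinations


def squared_names(features):
    # Plain Python, deliberately NOT decorated with @jit:
    # replaces map(lambda s: s + '_^2', features) from the question.
    return [name + '_^2' for name in features]


def pair_names(features, op):
    # Column labels such as 'a+b' for every pair of features,
    # in the same order that itertools.combinations yields them.
    return ['{0}{1}{2}'.format(c1, op, c2)
            for c1, c2 in combinations(features, 2)]
```

The names can then be attached to whatever numeric results come back from the compiled part of the code.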

Answer 2

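I had the same error. Numba does not support pandas objects. I converted the important columns of the pandas DataFrame into a set of arrays, and the code ran successfully under @jit. Arrays are also much faster than a pandas DataFrame, which matters if you need this for large datasets.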
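
The approach described in this answer might look roughly like the sketch below: the numeric columns are passed in as a 2-D NumPy array and the pairwise arithmetic is done in explicit loops. The function name pairwise_features and the sample values are assumptions made for illustration, and the no-op fallback decorator is there only so the sketch still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch only: without Numba, run as plain Python.
    def njit(func):
        return func


@njit
def pairwise_features(cols):
    # cols: 2-D float array with one column per feature.
    # Returns sums, products and quotients for every pair of columns.
    # Zeros in the divisor column are not handled here.
    n_rows, n_cols = cols.shape
    n_pairs = n_cols * (n_cols - 1) // 2
    added = np.empty((n_rows, n_pairs))
    multiplied = np.empty((n_rows, n_pairs))
    divided = np.empty((n_rows, n_pairs))
    k = 0
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            for r in range(n_rows):
                a = cols[r, i]
                b = cols[r, j]
                added[r, k] = a + b
                multiplied[r, k] = a * b
                divided[r, k] = a / b
            k += 1
    return added, multiplied, divided


# With pandas, the input could come from something like
# data[features].to_numpy(dtype=float); a literal array is used here instead.
cols = np.array([[1.0, 2.0, 4.0],
                 [3.0, 6.0, 2.0]])
added, multiplied, divided = pairwise_features(cols)
```

The pair order matches itertools.combinations over the column indices, so column labels built in plain Python line up with the result columns.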
