Question

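I am not getting the expected output format (a dictionary) when I use Joblib. Perhaps that is because I don't know how to write the code so that it benefits from Joblib.

Without Joblib I get the expected result, but with Joblib I cannot get the result in the right format.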

Note: because the dataframe is large, I use Elizabeth Santorella's Groupby class, which is faster than pandas groupby (http://esantorella.com/2016/06/16/groupby/).

Thanks in advance for your help.

Best regards

    # Point 1 (without joblib) works
    grouped = Groupby(df.index)
    d = {col: grouped.apply(sum, df[col], broadcast=False) for col in col1}

    #Point 1: OK (and it is what is expected)
    {'gp_0': array([0., 0., 0., ..., 0., 0., 0.]),
    'gp_1': array([0., 0., 0., ..., 0., 0., 0.]),
    'gp_2': array([0., 0., 0., ..., 0., 0., 0.])}

    # Point 2 doesn't work (output format list of arrays)
    from joblib import Parallel, delayed
    d = {}
    grouped = Groupby(new.index)
    d = Parallel(n_jobs=1, verbose=10)(delayed(grouped.apply)(sum, new[col], broadcast=False) for col in col1)

    # Point2: Not OK
    [array([0., 0., 0., ..., 0., 0., 0.]),
    array([0., 0., 0., ..., 0., 0., 0.]),
    array([0., 0., 0., ..., 0., 0., 0.])]

    # Point 3 doesn't work (output format list of dictionaries)
    d = {}
    grouped = Groupby(new.index)
    def my_func(col):
        return {col: grouped.apply(sum, new[col], broadcast=False)}
    d = Parallel(n_jobs=1, verbose=10)(delayed(my_func)(col) for col in col1)

    #Point 3: not OK
    [{'gp_0': array([0., 0., 0., ..., 0., 0., 0.])},
    {'gp_1': array([0., 0., 0., ..., 0., 0., 0.])},
    {'gp_2': array([0., 0., 0., ..., 0., 0., 0.])}]
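
A possible way back to the dictionary from Point 1: with its default settings, `Parallel` returns a plain list whose items are in the same order as the tasks it was given, so the list can be paired with the column names afterwards. The sketch below is only an illustration under assumptions: `column_sum` and the small `data` dict are hypothetical stand-ins for `grouped.apply(sum, new[col], broadcast=False)` and the real dataframe, which are not reproduced here.

```python
from joblib import Parallel, delayed


def column_sum(values):
    # Hypothetical stand-in for grouped.apply(sum, new[col], broadcast=False).
    return sum(values)


# Hypothetical toy data in place of the real dataframe columns.
data = {"gp_0": [1, 2, 3], "gp_1": [4, 5, 6], "gp_2": [7, 8, 9]}
col1 = list(data)

# As in Point 2: Parallel gives back a list, one result per column, in input order.
results = Parallel(n_jobs=1)(delayed(column_sum)(data[col]) for col in col1)

# Pair each column name with its result to rebuild the dictionary.
d = dict(zip(col1, results))


def labelled_sum(col):
    # As in Point 3: each task returns a one-entry dictionary.
    return {col: column_sum(data[col])}


parts = Parallel(n_jobs=1)(delayed(labelled_sum)(col) for col in col1)

# Merge the list of one-entry dictionaries into a single dictionary.
merged = {key: value for part in parts for key, value in part.items()}

print(d)
print(merged)
```

Both variants should end up with the same dictionary; the `zip` version keeps the worker function unchanged, while the merge version fits when each task already returns its own key.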

Unable to get the correct output format with Joblib

0 answers: