Question

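I have a DataFrame with 8 columns and want to get another DataFrame with 2 columns. The values in those two columns are computed from the original 8 values.

Is it possible to do this with apply or transform?

Example: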

import numpy as np
import pandas as pd

jnd = pd.DataFrame(np.random.rand(18, 8))

def appl(s):
    """particular processing is not important, only shapes matter.
       Therefore just randomly select 2 of passed values"""
    return np.random.choice(s, size=2)

jnd.apply(appl, axis=1)

This raises

ValueError: Shape of passed values is (18, 2), indices imply (18, 8)

The same happens with transform.

Answer 1

You can convert the output to a Series whose index supplies the new column names:

def appl(s):
    """particular processing is not important, only shapes matter.
       Therefore just randomly select 2 of passed values"""
    return pd.Series(np.random.choice(s, size=2), index=['a','b'])

print(jnd.apply(appl, axis=1))
           a         b
0   0.095437  0.256290
1   0.251450  0.072835
2   0.755617  0.630932
3   0.667163  0.449646
4   0.581908  0.341653
5   0.767170  0.376034
6   0.226523  0.120946
7   0.537986  0.385240
8   0.727680  0.998355
9   0.727728  0.308487
10  0.808792  0.286342
11  0.481634  0.767650
12  0.540303  0.106239
13  0.976599  0.640354
14  0.062515  0.062515
15  0.892971  0.856905
16  0.111959  0.526366
17  0.344646  0.268620
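
As a possible alternative, newer pandas versions (0.23 and later) accept a result_type argument in DataFrame.apply, and result_type="expand" turns list-like return values into columns. The sketch below assumes that argument is available. The min_max function is a hypothetical, deterministic stand-in for the real processing, used only so the result is easy to check:

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption, not in the question).
rng = np.random.default_rng(0)
jnd = pd.DataFrame(rng.random((18, 8)))

def min_max(row):
    # Hypothetical stand-in for the real processing: reduce 8 values to 2.
    return [row.min(), row.max()]

# result_type="expand" spreads each returned list across columns.
out = jnd.apply(min_max, axis=1, result_type="expand")

# The expanded columns are numbered 0 and 1 by default; name them explicitly.
out.columns = ["a", "b"]

print(out.shape)
```

Compared with returning a Series from the function, this should avoid building one Series object per row, at the cost of naming the columns in a separate step.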
