Question

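I want to iterate over rows and concatenate all the resulting dataframes while keeping each original row's information. I have a working example: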

MWE:

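import numpy as np
import pandas as pd

df = pd.DataFrame({'a': list(range(3)), 'b': list(range(3))})

# Build one (a + b + 1)-row frame of zeros per row, tagged with that row's values.
# np.zeros replaces pd.np.zeros: the pd.np alias was removed in pandas 2.0.
pd.concat(df.apply(lambda row: (
    pd.DataFrame(np.zeros((row.a + row.b + 1, 2)), columns=['c', 'd']).assign(**row)
), axis=1).values).reset_index(drop=True)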
     c    d  a  b
0  0.0  0.0  0  0
1  0.0  0.0  1  1
2  0.0  0.0  1  1
3  0.0  0.0  1  1
4  0.0  0.0  2  2
5  0.0  0.0  2  2
6  0.0  0.0  2  2
7  0.0  0.0  2  2
8  0.0  0.0  2  2

But this feels hacky to me. I suspect there is a direct way to combine all the results obtained from apply (as in R). Things I don't like:

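Using **row to add the initial values
Using the underlying numpy array in order to call pd.concat
reset_index, because the final index comes from the new dataframes created in the loop rather than from the original index.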

Answer 1

I couldn't find a duplicate. But IIUC, you are trying to perform a cross join (Cartesian product) of two dataframes:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': list(range(3)), 'b': list(range(3))})
df2 = pd.DataFrame([[1,2],[3,4]], columns=('c','d'))

pd.concat((df2.loc[np.tile(df2.index, len(df))].reset_index(drop=True),
           df.loc[df.index.repeat(len(df2))].reset_index(drop=True)),
          axis=1, ignore_index=True)

Output:

    0   1   2   3
0   1   2   0   0
1   3   4   0   0
2   1   2   1   1
3   3   4   1   1
4   1   2   2   2
5   3   4   2   2
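
As a hedged aside, the same pairing of every row of df with every row of df2 can probably be written more directly with merge, assuming pandas 1.2 or newer, where how='cross' is available. The variable name out is only illustrative, and the column reordering is there purely to mirror the c, d, a, b layout used elsewhere in this thread.

```python
import pandas as pd

df = pd.DataFrame({'a': list(range(3)), 'b': list(range(3))})
df2 = pd.DataFrame([[1, 2], [3, 4]], columns=('c', 'd'))

# Cross join: each row of df is paired with each row of df2.
# Assumes pandas >= 1.2 (how='cross' does not exist in older versions).
out = df.merge(df2, how='cross')

# Reorder columns so df2's columns come first, as in the answer's layout.
out = out[['c', 'd', 'a', 'b']]
print(out)
```

Unlike the concat call with ignore_index=True, this keeps the original column names, and the result already has a fresh 0..n-1 index, so no reset_index is needed.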

Or similarly:

common_idx = pd.MultiIndex.from_product((df.index, df2.index))

out1 = df.reindex(common_idx.get_level_values(0)).set_index(common_idx)

out2 = df2.reindex(common_idx.get_level_values(1)).set_index(common_idx)

pd.concat((out2,out1),axis=1).reset_index(drop=True)

Output:

   c  d  a  b
0  1  2  0  0
1  3  4  0  0
2  1  2  1  1
3  3  4  1  1
4  1  2  2  2
5  3  4  2  2
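
The examples above pair every row with a fixed second frame. For the case in the question, where each row expands to a different number of rows (a + b + 1), one possible sketch is to repeat the index labels by a per-row count instead of building a dataframe per row. This rests on the assumption that the generated columns c and d are constants (zeros, as in the MWE); the names counts and out are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'a': list(range(3)), 'b': list(range(3))})

# Number of output rows each original row should produce (a + b + 1, as in the MWE).
counts = (df['a'] + df['b'] + 1).to_numpy()

out = (
    df.loc[df.index.repeat(counts)]   # repeat each row counts[i] times
      .assign(c=0.0, d=0.0)           # placeholder columns, zeros as in the MWE
      [['c', 'd', 'a', 'b']]          # same column order as the question's output
      .reset_index(drop=True)
)
print(out)
```

Dropping the final reset_index(drop=True) keeps the repeated original index labels instead, which may be closer to what the question asks for regarding the original index.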

Pandas: iterate over rows and automatically combine the results?
