Pandas DataFrame: expanding tuple values into columns with a MultiIndex

Time: 2019-10-15 14:37:43

Tags: python pandas multi-index

I have a DataFrame df:

                A    B
first second          
bar   one     0.0  0.0
      two     0.0  0.0
foo   one     0.0  0.0
      two     0.0  0.0

I transform it into another DataFrame whose values are tuples:

                      A          B
first second                      
bar   one     (6, 1, 0)  (0, 9, 3)
      two     (9, 3, 4)  (6, 2, 1)
foo   one     (1, 9, 0)  (4, 0, 0)
      two     (6, 1, 5)  (8, 3, 5)

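My question is how to make it (expanded) look like the following, where the tuple values become columns under a MultiIndex. Can I do this during the transform, or should it be an additional step after the transform?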

                   A       B
               m n k   m n k            
first second   
bar   one      6 1 0   0 9 3
      two      9 3 4   6 2 1
foo   one      1 9 0   4 0 0
      two      6 1 5   8 3 5

Code for the above:

import numpy as np
import pandas as pd

np.random.seed(123)


def expand(s):
    # complex logic of `result` has been replaced with `np.random`
    result = [tuple(np.random.randint(10, size=3)) for i in s]
    return result


index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']], names=['first', 'second'])
df = pd.DataFrame(np.zeros((4, 2)), index=index, columns=['A', 'B'])
print(df)

expanded = df.groupby(['second']).transform(expand)
print(expanded)

2 answers:

Answer 0 (score: 1)

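Try this:

df_lst = []
for col in df.columns:
    # Turn each tuple in this column into a row of three separate values
    expanded_splt = expanded.apply(lambda x: pd.Series(x[col]), axis=1)
    # Label the new columns as (original column, tuple position)
    columns = pd.MultiIndex.from_product([[col], ['m', 'n', 'k']])
    expanded_splt.columns = columns
    df_lst.append(expanded_splt)
# Join the per-column pieces side by side
result = pd.concat(df_lst, axis=1)
print(result)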

Output:

                A           B
                m   n   k   m   n   k
first   second                      
bar     one     6   1   0   0   9   3
        two     9   3   4   6   2   1
foo     one     1   9   0   4   0   0
        two     6   1   5   8   3   5
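
A possible variant of the loop above (not part of the original answer) builds each block directly from the list of tuples and lets pd.concat create the outer column level from the dictionary keys. This is only a sketch: the `expanded` frame is a hard-coded stand-in with the values shown in the question, and the names `sub_names`, `parts` and `result` are my own, chosen for illustration. It assumes every tuple has exactly three items.

```python
import pandas as pd

# Hypothetical stand-in for the question's `expanded` frame (same values as shown above)
index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)
expanded = pd.DataFrame(
    {
        "A": [(6, 1, 0), (9, 3, 4), (1, 9, 0), (6, 1, 5)],
        "B": [(0, 9, 3), (6, 2, 1), (4, 0, 0), (8, 3, 5)],
    },
    index=index,
)

sub_names = ["m", "n", "k"]  # labels for the tuple positions, as in the question

# One small frame per original column, built from that column's list of tuples
parts = {
    col: pd.DataFrame(expanded[col].tolist(), index=expanded.index, columns=sub_names)
    for col in expanded.columns
}

# Passing a dict to pd.concat uses its keys as the outer column level
result = pd.concat(parts, axis=1)
print(result)
```

Building the pieces with tolist() avoids calling pd.Series once per row, which may matter on larger frames, though I have not benchmarked it here.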

Answer 1 (score: 0)

I finally had time to find an answer that works for me.

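# Flatten each row's tuples into one 1-D array, then stack the rows into a 2-D array
expanded_data = expanded.agg(lambda x: np.concatenate(x), axis=1).to_numpy()
expanded_data = np.stack(expanded_data)
# Outer level: original column names; inner level: labels for the tuple positions
column_index = pd.MultiIndex.from_product([expanded.columns, ['m', 'n', 'k']])
exploded = pd.DataFrame(expanded_data, index=expanded.index, columns=column_index)
print(exploded)

Output:
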
              A        B      
              m  n  k  m  n  k
first second                  
bar   one     6  1  0  0  9  3
      two     9  3  4  6  2  1
foo   one     1  9  0  4  0  0
      two     6  1  5  8  3  5
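
For comparison, here is a rough sketch of the same idea without the row-wise agg, assuming all tuples have the same length so that NumPy can rebuild the values as a regular 3-D array. The `expanded` frame is again a hard-coded stand-in for the one in the question, and `sub_names`, `cube` and `flat` are illustrative names of my own.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `expanded` frame (same values as shown above)
index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)
expanded = pd.DataFrame(
    {
        "A": [(6, 1, 0), (9, 3, 4), (1, 9, 0), (6, 1, 5)],
        "B": [(0, 9, 3), (6, 2, 1), (4, 0, 0), (8, 3, 5)],
    },
    index=index,
)

sub_names = ["m", "n", "k"]  # labels for the tuple positions, as in the question

# to_numpy() gives a (rows, cols) object array of tuples; going through nested
# lists lets NumPy rebuild it as a regular (rows, cols, tuple_len) array
cube = np.array(expanded.to_numpy().tolist())

# Merge the last two axes so each row holds A's items followed by B's items
flat = cube.reshape(len(expanded), -1)

column_index = pd.MultiIndex.from_product([expanded.columns, sub_names])
exploded = pd.DataFrame(flat, index=expanded.index, columns=column_index)
print(exploded)
```

The reshape keeps each tuple's items adjacent, which matches the ordering that MultiIndex.from_product produces for the column labels. If the tuples had differing lengths, np.array would not produce a regular 3-D array and this sketch would not apply.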