Pandas crosstab across a multi-index

Time: 2020-05-19 14:06:00

Tags: pandas multi-index crosstab

Hi, I am trying to get a crosstab from a DataFrame "df" that has multi-index columns:

df.tail()

code    X1  X2  X3
pays    USA USA USA
desc    phase   phase   phase
2020-01-01  a   a   a
2020-02-01  b   c   d
2020-03-01  a   a   b
2020-04-01  c   a   a
2020-05-01  d   a   d

I would like to get something like this:

             X1                X2           X3
       a   b   c   d       a  b  c  d    a  b  c  d

    a
X1  b 
    c 
    d                          
    a
X2  b 
    c
    d
    a
X3  b
    c 
    d 

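In each cell I would like the number of rows in which Xi and Xj take the given pair of values from (a, b, c, d), or the corresponding percentage.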

I tried:

pd.crosstab(index = df, columns = df) 

But I get an error message:

ValueError: Shape of passed values is (3, 2), indices imply (605, 2)
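
The error most likely arises because pd.crosstab expects one-dimensional array-likes (or lists of them) for index and columns, not a whole DataFrame. As a minimal sketch, the frame below is rebuilt by hand from the df.tail() output above (the level names code, pays and desc are read off that output, and the name flat is only illustrative). Dropping the two constant column levels and passing one pair of columns gives an ordinary crosstab:

```python
import pandas as pd

# Rebuild the five rows shown in df.tail(); the real frame has more rows.
columns = pd.MultiIndex.from_tuples(
    [("X1", "USA", "phase"), ("X2", "USA", "phase"), ("X3", "USA", "phase")],
    names=["code", "pays", "desc"],
)
df = pd.DataFrame(
    [list("aaa"), list("bcd"), list("aab"), list("caa"), list("dad")],
    index=pd.to_datetime(
        ["2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01"]
    ),
    columns=columns,
)

# Keep only the "code" level so that each column is addressed by a plain label.
flat = df.droplevel(["pays", "desc"], axis=1)

# crosstab works on two 1-D Series: counts of X1 values against X2 values.
ct = pd.crosstab(flat["X1"], flat["X2"])
print(ct)
```

Only the value pairs that actually occur show up, so this table has four rows (a, b, c, d) but two columns (a, c).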

Thanks for your help.

1 answer:

Answer 0: (score: 0)

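I did not find a way to do this with a single pd.crosstab call, but it can be done with a double loop. I would like to use this with ordered (categorical) types, but my naive attempt (commented out) did not work.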

import pandas as pd
import numpy as np

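def full_crosstab(df, row_keys=None, col_keys=None):
    """Crosstab every column in row_keys against every column in col_keys."""
    # Use explicit None checks: `keys or df.columns` raises a ValueError
    # when keys is a pandas Index, whose truth value is ambiguous.
    row_keys = df.columns if row_keys is None else row_keys
    col_keys = df.columns if col_keys is None else col_keys
    df_final = []
    for outer in row_keys:
        # One horizontal strip: `outer` against each column in col_keys.
        df_outer = []
        for inner in col_keys:
            df_inner = pd.crosstab(df[outer], df[inner])
            df_outer.append(df_inner)
        df_outer = pd.concat(df_outer, axis=1, keys=col_keys)
        df_final.append(df_outer)
    # Stack the strips vertically, labelling each with its row variable.
    return pd.concat(df_final, keys=row_keys)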


def category(values, size):
    series = np.random.choice(values, size=size)
    return pd.Series(series)
    #dtype = pd.CategoricalDtype(categories=values, ordered=True)
    #return pd.Series(series, dtype=dtype)

size = 100
mydf = pd.DataFrame(dict(
    age_range=category(['<18', '18-34', '35-64', '65+'], size=size),
    reg=category(['yes', 'no'], size=size),
    issue=category(['guns', 'schools', 'healthcare'], size=size),
))


df_ct = full_crosstab(mydf)
print(df_ct)
                     age_range               reg     issue                   
                         18-34 35-64 65+ <18  no yes  guns healthcare schools
age_range 18-34             22     0   0   0  14   8     8          7       7
          35-64              0    24   0   0  10  14    11         10       3
          65+                0     0  23   0  13  10     9          5       9
          <18                0     0   0  31  17  14     5         14      12
reg       no                14    10  13  17  54   0    13         19      22
          yes                8    14  10  14   0  46    20         17       9
issue     guns               8    11   9   5  13  20    33          0       0
          healthcare         7    10   5  14  19  17     0         36       0
          schools            7     3   9  12  22   9     0          0      31
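
As an alternative to the double loop, the same table can be obtained by one-hot encoding every column and multiplying the indicator matrix by its own transpose. This is only a sketch, not the method used above: the names onehot_crosstab, orders and to_row_percent are hypothetical, and the sample frame reuses the five rows from the question with flat column names. The optional orders mapping is a workaround for a fixed value order without categorical dtypes, and to_row_percent covers the percentage variant asked for in the question.

```python
import numpy as np
import pandas as pd


def onehot_crosstab(df, orders=None):
    """Full pairwise crosstab via one-hot encoding and a matrix product.

    orders (hypothetical parameter) maps a column name to the desired
    order of its values; columns not listed keep the sorted order.
    """
    orders = orders or {}
    blocks = {}
    for col in df.columns:
        # One 0/1 indicator column per distinct value of `col`.
        dummies = pd.get_dummies(df[col]).astype(int)
        if col in orders:
            # Impose the requested order; values never seen become all-zero.
            dummies = dummies.reindex(columns=orders[col], fill_value=0)
        blocks[col] = dummies
    # Column labels become (variable, value) pairs.
    onehot = pd.concat(blocks, axis=1)
    values = onehot.to_numpy()
    # Entry (i, j) counts the rows in which indicators i and j are both 1.
    return pd.DataFrame(values.T @ values, index=onehot.columns, columns=onehot.columns)


def to_row_percent(ct):
    """Divide each row by the number of rows having that row's value."""
    # The diagonal holds the count of each (variable, value) on its own.
    totals = pd.Series(np.diag(ct.to_numpy()), index=ct.index)
    # Values with a zero count yield NaN rows.
    return ct.div(totals, axis=0) * 100


sample = pd.DataFrame({
    "X1": list("abacd"),
    "X2": list("acaaa"),
    "X3": list("adbad"),
})
counts = onehot_crosstab(sample)
percent = to_row_percent(counts)
print(counts)
print(percent.round(1))
```

The matrix product avoids calling pd.crosstab once per pair of columns, at the cost of building an indicator matrix with one column per distinct value, so it suits columns with few distinct values.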