Can we change the output of pandas crosstab?

Time: 2019-12-06 05:15:03

Tags: python pandas

I have loaded raw_data from MySQL using sqlalchemy and pymysql:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('mysql+pymysql://[user]:[passwd]@[host]:[port]/[database]')

df = pd.read_sql_table('data', engine)

df looks like this:

| Age Category | Category       |
|--------------|----------------|
| 31-26        | Engaged        |
| 26-31        | Engaged        |
| 31-36        | Not Engaged    |
| Above 51     | Engaged        |
| 41-46        | Disengaged     |
| 46-51        | Nearly Engaged |
| 26-31        | Disengaged     |

Then I ran the following analysis:

age = pd.crosstab(df['Age Category'], df['Category'])

| Category     | A | B  | C  | D |
|--------------|---|----|----|---|
| Age Category |   |    |    |   |
| 21-26        | 2 | 2  | 4  | 1 |
| 26-31        | 7 | 11 | 12 | 5 |
| 31-36        | 3 | 5  | 5  | 2 |
| 36-41        | 2 | 4  | 1  | 7 |
| 41-46        | 0 | 1  | 3  | 2 |
| 46-51        | 0 | 0  | 2  | 3 |
| Above 51     | 0 | 3  | 0  | 6 |
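For readers who want to reproduce the shape of this output without a MySQL connection, here is a minimal sketch that rebuilds the seven sample rows shown earlier by hand. The row values are copied from the sample table; the real table evidently holds more rows, so the counts will differ from those above.

```python
import pandas as pd

# Seven sample rows copied from the table in the question (not the full data set).
df = pd.DataFrame({
    'Age Category': ['31-26', '26-31', '31-36', 'Above 51', '41-46', '46-51', '26-31'],
    'Category': ['Engaged', 'Engaged', 'Not Engaged', 'Engaged',
                 'Disengaged', 'Nearly Engaged', 'Disengaged'],
})

age = pd.crosstab(df['Age Category'], df['Category'])

# The two labels in the printed header are axis names, not data.
print(age.index.name)    # Age Category
print(age.columns.name)  # Category
```

The extra "Age Category" row in the printed header is simply how pandas displays the index name beneath the columns name.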

I want to change it into a pandas DataFrame that looks like this:

| Age Category | A | B  | C  | D |
|--------------|---|----|----|---|
| 21-26        | 2 | 2  | 4  | 1 |
| 26-31        | 7 | 11 | 12 | 5 |
| 31-36        | 3 | 5  | 5  | 2 |
| 36-41        | 2 | 4  | 1  | 7 |
| 41-46        | 0 | 1  | 3  | 2 |
| 46-51        | 0 | 0  | 2  | 3 |
| Above 51     | 0 | 3  | 0  | 6 |

Thank you for your time and consideration.

1 answer:

Answer 0 (score: 2)

Both of these labels are names: one is the name of the columns and the other is the name of the index. To change them, use DataFrame.rename_axis:

age = age.rename_axis(index=None, columns='Age Category')
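To see what this call changes, a small sketch can compare the axis names before and after. It assumes the sample rows from the question, rebuilt by hand rather than read from MySQL.

```python
import pandas as pd

# Sample rows copied from the question (an assumption; the real table is larger).
df = pd.DataFrame({
    'Age Category': ['31-26', '26-31', '31-36', 'Above 51', '41-46', '46-51', '26-31'],
    'Category': ['Engaged', 'Engaged', 'Not Engaged', 'Engaged',
                 'Disengaged', 'Nearly Engaged', 'Disengaged'],
})
age = pd.crosstab(df['Age Category'], df['Category'])
before = (age.index.name, age.columns.name)

# Clear the index name and reuse its text as the columns name.
renamed = age.rename_axis(index=None, columns='Age Category')
after = (renamed.index.name, renamed.columns.name)

print(before)  # ('Age Category', 'Category')
print(after)   # (None, 'Age Category')

# The counts themselves are untouched; only the axis labels changed.
assert (renamed.to_numpy() == age.to_numpy()).all()
```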

Or set the columns name from the index name, then reset the index name to its default, None:

age.columns.name = age.index.name
age.index.name = None

print(age)
Age Category  Disengaged  Engaged  Nearly Engaged  Not Engaged
26-31                  1        1               0            0
31-26                  0        1               0            0
31-36                  0        0               0            1
41-46                  1        0               0            0
46-51                  0        0               1            0
Above 51               0        1               0            0
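The two approaches should give the same result. The sketch below checks that on the sample rows from the question (rebuilt by hand, which is an assumption) and highlights one practical difference: rename_axis returns a new DataFrame, while assigning to `.name` changes the object it is applied to.

```python
import pandas as pd

# Sample rows copied from the question (an assumption; the real table is larger).
df = pd.DataFrame({
    'Age Category': ['31-26', '26-31', '31-36', 'Above 51', '41-46', '46-51', '26-31'],
    'Category': ['Engaged', 'Engaged', 'Not Engaged', 'Engaged',
                 'Disengaged', 'Nearly Engaged', 'Disengaged'],
})
age = pd.crosstab(df['Age Category'], df['Category'])

# Option 1: rename_axis returns a new DataFrame and leaves `age` as it was.
via_rename = age.rename_axis(index=None, columns='Age Category')

# Option 2: attribute assignment modifies the object it is applied to,
# so work on a copy here to keep `age` intact for comparison.
via_attrs = age.copy()
via_attrs.columns.name = via_attrs.index.name
via_attrs.index.name = None

# check_names=True (the default) also compares index and columns names.
pd.testing.assert_frame_equal(via_rename, via_attrs)
print(age.index.name)  # Age Category (the original is unchanged)
```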

Keep in mind that these names behave like metadata, so some functions may drop them.
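One example of this, sketched with the sample rows from the question rebuilt by hand: writing the renamed frame to CSV does not emit the columns name, so the "Age Category" label disappears from the header. If the label needs to survive as real data, one alternative (not part of the answer above) is to turn the index into an ordinary column with reset_index and then clear the leftover columns name.

```python
import pandas as pd

# Sample rows copied from the question (an assumption; the real table is larger).
df = pd.DataFrame({
    'Age Category': ['31-26', '26-31', '31-36', 'Above 51', '41-46', '46-51', '26-31'],
    'Category': ['Engaged', 'Engaged', 'Not Engaged', 'Engaged',
                 'Disengaged', 'Nearly Engaged', 'Disengaged'],
})
age = pd.crosstab(df['Age Category'], df['Category'])

# The answer's approach: the label lives only in the columns name.
renamed = age.rename_axis(index=None, columns='Age Category')
csv_text = renamed.to_csv()
print('Age Category' in csv_text)  # False: the columns name is not written out

# Alternative: make 'Age Category' an ordinary column, then drop the
# leftover 'Category' columns name.
flat = age.reset_index().rename_axis(columns=None)
print(list(flat.columns))
```

With the second form, "Age Category" is a regular column, so it is written to CSV and kept by operations that ignore axis names. The trade-off is that the age ranges are no longer the index.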