This is a Python pandas question.
I have a table that I want to summarize:
+--------+-------+--------+-----------+
| Gender | State | Age | Purchased |
+--------+-------+--------+-----------+
| Male | NV | Adult | Yes |
| Female | NV | Adult | Yes |
| Male | FL | Teen | Yes |
| Male | FL | Adult | No |
| Female | NV | Teen | No |
| Female | NY | Senior | Yes |
| Male | NY | Senior | Yes |
| Female | NY | Adult | Yes |
| Female | NV | Teen | Yes |
| Male | NV | Adult | No |
| Female | FL | Senior | Yes |
| Male | Fl | Teen | No |
| Male | NY | Teen | Yes |
| Female | NV | Adult | No |
+--------+-------+--------+-----------+
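For reproducibility, the table above can be built as a DataFrame. This is a minimal sketch; the variable name `df` is an assumption chosen to match the answers, and the rows are copied as shown, including the lowercase "Fl" in the twelfth row.

```python
import pandas as pd

# Rows copied from the table above; "Fl" is kept exactly as it appears there.
rows = [
    ("Male", "NV", "Adult", "Yes"),
    ("Female", "NV", "Adult", "Yes"),
    ("Male", "FL", "Teen", "Yes"),
    ("Male", "FL", "Adult", "No"),
    ("Female", "NV", "Teen", "No"),
    ("Female", "NY", "Senior", "Yes"),
    ("Male", "NY", "Senior", "Yes"),
    ("Female", "NY", "Adult", "Yes"),
    ("Female", "NV", "Teen", "Yes"),
    ("Male", "NV", "Adult", "No"),
    ("Female", "FL", "Senior", "Yes"),
    ("Male", "Fl", "Teen", "No"),
    ("Male", "NY", "Teen", "Yes"),
    ("Female", "NV", "Adult", "No"),
]
df = pd.DataFrame(rows, columns=["Gender", "State", "Age", "Purchased"])
```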
I want to combine the categories of each column while counting the "Purchased" values, effectively producing something like this:
+--------+----------+-----------+
|        |          | Purchased |
+--------+----------+-----+-----+
|        |          | Yes | No  |
+--------+----------+-----+-----+
| Gender | Male     | 4   | 3   |
|        | Female   | 5   | 2   |
| State  | State FL | 2   | 2   |
|        | State NV | 3   | 3   |
|        | State NY | 4   | 0   |
| Age    | Senior   | 3   | 0   |
|        | Adult    | 3   | 3   |
|        | Teen     | 3   | 2   |
+--------+----------+-----+-----+
Answer 0: (score: 1)
Use crosstab + concat:
pd.concat([pd.crosstab(df[x], df.Purchased) for x in df.columns[:-1]], keys=df.columns[:-1])
Out[273]:
Purchased      No  Yes
Gender Female   2    5
       Male     3    4
State  FL       1    2
       Fl       1    0
       NV       3    3
       NY       0    4
Age    Adult    3    3
       Senior   0    3
       Teen     2    3
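Note that the output above lists "Fl" separately from "FL", and orders the columns No, Yes rather than Yes, No as requested. A possible adjustment is sketched below, under the assumption that "Fl" is a typo for "FL"; the names `clean` and `result` are illustrative only.

```python
import pandas as pd

data = """
Male NV Adult Yes
Female NV Adult Yes
Male FL Teen Yes
Male FL Adult No
Female NV Teen No
Female NY Senior Yes
Male NY Senior Yes
Female NY Adult Yes
Female NV Teen Yes
Male NV Adult No
Female FL Senior Yes
Male Fl Teen No
Male NY Teen Yes
Female NV Adult No
"""
df = pd.DataFrame(
    [line.split() for line in data.strip().splitlines()],
    columns=["Gender", "State", "Age", "Purchased"],
)

# Assumption: "Fl" is a typo for "FL", so upper-case the State column first.
clean = df.assign(State=df["State"].str.upper())

cols = clean.columns[:-1]  # every column except Purchased
result = pd.concat(
    [pd.crosstab(clean[c], clean["Purchased"]) for c in cols],
    keys=cols,
)

# Put "Yes" before "No" to match the layout requested in the question.
result = result[["Yes", "No"]]
print(result)
```

With the state codes normalized, the FL row should combine the counts that were previously split between "FL" and "Fl".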
Answer 1: (score: 1)
My approach:
a = {}
for col in ['Gender', 'State', 'Age']:
    # Count Purchased values per category; fill_value=0 avoids NaN for absent combinations
    a[col] = df.groupby(col).Purchased.value_counts().unstack(fill_value=0)
pd.concat(a)
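One detail of the groupby approach worth checking: `value_counts()` only returns combinations that actually occur, so a plain `unstack()` leaves NaN where a category never has a given Purchased value (for example NY with No), while `unstack(fill_value=0)` gives zeros. The sketch below illustrates this on the State column only; the names `counts`, `plain`, and `filled` are illustrative.

```python
import pandas as pd

data = """
Male NV Adult Yes
Female NV Adult Yes
Male FL Teen Yes
Male FL Adult No
Female NV Teen No
Female NY Senior Yes
Male NY Senior Yes
Female NY Adult Yes
Female NV Teen Yes
Male NV Adult No
Female FL Senior Yes
Male Fl Teen No
Male NY Teen Yes
Female NV Adult No
"""
df = pd.DataFrame(
    [line.split() for line in data.strip().splitlines()],
    columns=["Gender", "State", "Age", "Purchased"],
)

# Counts per (State, Purchased) pair; pairs that never occur are simply absent.
counts = df.groupby("State")["Purchased"].value_counts()

plain = counts.unstack()                # NY never has "No", so that cell is NaN
filled = counts.unstack(fill_value=0)   # same table with zeros instead of NaN

print(plain)
print(filled)
```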