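How do I properly concat (or maybe this should be .merge()?) N dataframes with the same column names, so that I can group them by distinct column keys? For example: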
dfs = {
    'A': df1,  # columns are C1, C2, C3
    'B': df2,  # same columns C1, C2, C3
}
gathered_df = pd.concat(dfs.values()).groupby(['C2'])['C3']\
.count()\
.sort_values(ascending=False)\
.reset_index()
I would like to get something like:
|----------|------------|-------------|
|          | A          | B           |
| C2_val1  | count_perA | count_perB  |
| C2_val2  | count_perA | count_perB  |
| C2_val3  | count_perA | count_perB  |
Answer 0: (score: 1)
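I think you need to pass the dict itself to pd.concat, so that its keys become the outer level of a MultiIndex, and then call reset_index to turn that MultiIndex into columns. The outer level becomes a column named level_0; add it to the groupby keys to distinguish the dataframes. Finally, reshape with unstack.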
See also: What is the difference between size and count in pandas? (count skips NaN values in the selected column, while size counts every row.)
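To illustrate the size versus count distinction mentioned above, here is a minimal sketch. The frame `df` and its values are made up for illustration; only the column names C2 and C3 come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one C3 value in group "x" is missing.
df = pd.DataFrame({
    "C2": ["x", "x", "y"],
    "C3": [1.0, np.nan, 3.0],
})

grouped = df.groupby("C2")["C3"]
counts = grouped.count()  # counts only non-null C3 values per group
sizes = grouped.size()    # counts all rows per group, NaN included

print(counts)
print(sizes)
```

If C3 never contains NaN, both give the same numbers, so the choice only matters when missing values are possible.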
Sample:
gathered_df = pd.concat(dfs).reset_index().groupby(['C2','level_0'])['C3'].count().unstack()
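A self-contained version of the approach might look like the sketch below. The contents of `df1` and `df2` are invented for illustration, and `fill_value=0` is an addition not present in the answer: it replaces the NaN that would otherwise appear when a C2 value is missing from one of the frames.

```python
import pandas as pd

# Hypothetical sample data; only the column names C1, C2, C3 follow the question.
df1 = pd.DataFrame({
    "C1": [1, 2, 3, 4],
    "C2": ["x", "x", "y", "z"],
    "C3": [10, 20, 30, 40],
})
df2 = pd.DataFrame({
    "C1": [5, 6, 7],
    "C2": ["x", "y", "y"],
    "C3": [50, 60, 70],
})
dfs = {"A": df1, "B": df2}

# Passing the dict (not dfs.values()) makes its keys the outer index level;
# reset_index() then exposes that level as the column 'level_0'.
combined = pd.concat(dfs).reset_index()

gathered_df = (
    combined.groupby(["C2", "level_0"])["C3"]
    .count()
    .unstack(fill_value=0)  # one column per source frame, 0 where a C2 value is absent
)
print(gathered_df)
```

The key difference from the attempt in the question is that `pd.concat(dfs.values())` discards the dict keys, so there is nothing left to tell the frames apart after concatenation.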