Here is my problem.
## Example data
Name type Value1 Value2 Value3 Value4
A unemp 1.733275e+09 2.067889e+09 3.279421e+09 3.223396e+09
B unemp 1.413758e+09 2.004171e+09 2.383106e+09 2.540857e+09
C unemp 1.287548e+09 1.462072e+09 2.831217e+09 3.528558e+09
A unemp 2.651480e+09 2.846055e+09 5.882084e+09 5.247459e+09
D unemp 2.257016e+09 4.121532e+09 4.961291e+09 5.330930e+09
C unemp 7.156784e+08 1.182770e+09 1.704251e+09 2.587171e+09
E emp 6.012397e+09 9.692455e+09 2.288822e+10 3.215460e+10
F emp 5.647393e+09 9.597211e+09 2.121828e+10 3.107219e+10
G emp 4.617047e+09 8.030113e+09 2.005203e+10 2.755665e+10
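My goal: compare the 'Name' column and combine the rows that share the same 'Name' by summing their values.
I am using the following code: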
import pandas as pd

f_test = pd.read_clipboard()  # reads the table above after copying it to the clipboard
f_test.groupby('Name').sum().reset_index()
This gives me the sums per 'Name', but how can I keep the 'type' column in the result? Any suggestions are welcome!
Answer 0: (score: 2)
You can `merge` the grouped sums with a subset of columns ('Name' and 'type') taken from the original DataFrame.
Result:
>>> pd.merge(
...     f_test.groupby('Name').sum(numeric_only=True).reset_index(),
...     f_test[['Name', 'type']].drop_duplicates(),
...     on='Name',
...     how='right')
Name Value1 Value2 Value3 Value4 type
0 A 4.384755e+09 4.913944e+09 9.161505e+09 8.470855e+09 unemp
1 B 1.413758e+09 2.004171e+09 2.383106e+09 2.540857e+09 unemp
2 C 2.003226e+09 2.644842e+09 4.535468e+09 6.115729e+09 unemp
3 D 2.257016e+09 4.121532e+09 4.961291e+09 5.330930e+09 unemp
4 E 6.012397e+09 9.692455e+09 2.288822e+10 3.215460e+10 emp
5 F 5.647393e+09 9.597211e+09 2.121828e+10 3.107219e+10 emp
6 G 4.617047e+09 8.030113e+09 2.005203e+10 2.755665e+10 emp
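
As an alternative to the merge, the same result can probably be reached inside the `groupby` itself. The sketch below is not part of the original answer. It uses a small made-up frame (the values and the names `df`, `by_both`, `by_name` are assumptions for illustration) and assumes each 'Name' maps to exactly one 'type', as in the question's data.

```python
import pandas as pd

# Small made-up frame with the same column layout as the question's data.
df = pd.DataFrame({
    "Name": ["A", "B", "A", "E"],
    "type": ["unemp", "unemp", "unemp", "emp"],
    "Value1": [1.0, 2.0, 3.0, 4.0],
    "Value2": [10.0, 20.0, 30.0, 40.0],
})

# Option 1: group on both columns, so 'type' survives as a grouping key.
by_both = df.groupby(["Name", "type"], as_index=False).sum()

# Option 2: group on 'Name' only, sum the value columns,
# and keep the first 'type' seen in each group.
agg_spec = {col: "sum" for col in df.columns if col.startswith("Value")}
agg_spec["type"] = "first"
by_name = df.groupby("Name", as_index=False).agg(agg_spec)

print(by_both)
print(by_name)
```

If a 'Name' could appear with more than one 'type', option 1 would produce one row per ('Name', 'type') pair, while option 2 would silently keep only the first 'type', so which one fits depends on the data.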