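I'm looking for a more elegant way to get the unique list of winners (the party with the most votes) for each pandas group.
I have downloaded the California election results and prepared the data for use in a function called create_df.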
df = create_df()
df.head()
  candidate   county  district    office party  precinct  votes
0  JOHN COX  ALAMEDA       NaN  GOVERNOR   REP    200100   49.0
1  JOHN COX  ALAMEDA       NaN  GOVERNOR   REP    200200   55.0
2  JOHN COX  ALAMEDA       NaN  GOVERNOR   REP    200300   26.0
3  JOHN COX  ALAMEDA       NaN  GOVERNOR   REP    200600   28.0
4  JOHN COX  ALAMEDA       NaN  GOVERNOR   REP    200700   35.0
My current implementation looks like this:
county_votes = df.query("office == 'GOVERNOR'")\
    .groupby(["county", "party"], as_index=False)\
    .votes.sum()
winners = county_votes.reindex(
    county_votes.groupby("county").votes.idxmax().values
)[["county", "party"]]
winners.head()
county party
0 ALAMEDA DEM
2 ALPINE DEM
5 AMADOR REP
7 BUTTE REP
9 CALAVERAS REP
Is there a better way?
Answer 0 (score: 0)
I found another way, and it also seems to be faster.
42.4 ms ± 97 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
county_votes = df.query("office == 'GOVERNOR'")\
    .groupby(["county", "party"], as_index=False)\
    .votes.sum()
county_votes.reindex(
    county_votes.groupby("county").votes.idxmax().values
)[["county", "party"]].head()
31.6 ms ± 60.9 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
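For comparison, one commonly used alternative to the idxmax/reindex pattern is to sort by votes and keep the first row per county. The sketch below is an illustration only, not the approach timed above: the small DataFrame is made-up toy data with the same column names, and no claim is made here about its relative speed. Ties within a county are broken arbitrarily.

```python
import pandas as pd

# Hypothetical toy data using the same column names as the question
# (these are NOT the real California election results).
df = pd.DataFrame({
    "county": ["ALAMEDA", "ALAMEDA", "ALAMEDA", "ALPINE", "ALPINE", "AMADOR", "AMADOR"],
    "office": ["GOVERNOR"] * 7,
    "party": ["DEM", "REP", "DEM", "DEM", "REP", "DEM", "REP"],
    "votes": [100.0, 49.0, 80.0, 30.0, 10.0, 20.0, 45.0],
})

# Total votes per (county, party) for the governor's race.
county_votes = (
    df[df["office"] == "GOVERNOR"]
    .groupby(["county", "party"], as_index=False)["votes"]
    .sum()
)

# Sort so the highest vote total comes first, then keep one row per county.
winners = (
    county_votes.sort_values("votes", ascending=False)
    .drop_duplicates("county")
    .sort_values("county")[["county", "party"]]
    .reset_index(drop=True)
)
print(winners)
```

If you want to compare it against the versions above, run it under %%timeit on the real data, since the toy frame is too small for timings to mean anything.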