My pandas DataFrame looks like this:
A, B
----
a, 2
a, 5
a, 6
b, 1
b, 2
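I want to group by column A, sum the values in column B within each group, and append the result as another column, producing the following DataFrame: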
A, B, SUM
--------
a, 2, 13
a, 5, 13
a, 6, 13
b, 1, 3
b, 2, 3
How can I do this in pandas?
Answer 0 (score: 3)
Use transform.
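As a minimal sketch of the transform approach applied to the DataFrame in the question (the variable name df and the way the frame is constructed are assumptions for illustration): a grouped transform returns one value per original row rather than one per group, so the result can be assigned directly as a new column.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "A": ["a", "a", "a", "b", "b"],
    "B": [2, 5, 6, 1, 2],
})

# Group by A, sum B within each group, and broadcast each group's sum
# back onto every row of that group. The result keeps df's index,
# so it lines up with the original rows on assignment.
df["SUM"] = df.groupby("A")["B"].transform("sum")

print(df)
```

By contrast, df.groupby("A")["B"].sum() would return one row per group and would need to be merged back onto the original frame.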