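I'm looking for a way to concatenate the values in two Python dictionaries that contain numpy arrays, while avoiding a manual loop over the dictionary keys. For example: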
import numpy as np
# Create first dictionary
n = 5
s = np.random.randint(1,101,n)
r = np.random.rand(n)
d = {"r":r,"s":s}
print "d = ",d
# Create second dictionary
n = 2
s = np.random.randint(1,101,n)
r = np.random.rand(n)
t = np.array(["a","b"])
d2 = {"r":r,"s":s,"t":t}
print "d2 = ",d2
# Some operation to combine the two dictionaries...
d = SomeOperation(d,d2)
# Updated dictionary
print "d3 = ",d
which gives the output
>> d = {'s': array([75, 25, 88, 54, 82]), 'r': array([ 0.1021227 , 0.99454874, 0.38680718, 0.98720877, 0.8662894 ])}
>> d2 = {'s': array([78, 92]), 'r': array([ 0.27610587, 0.57037473]), 't': array(['a', 'b'], dtype='|S1')}
>> d3 = {'s': array([75, 25, 88, 54, 82, 78, 92]), 'r': array([ 0.1021227 , 0.99454874, 0.38680718, 0.98720877, 0.8662894, 0.27610587, 0.57037473]), 't': array(['a', 'b'], dtype='|S1')}
That is, if a key already exists, the new values are appended to the numpy array stored under that key.
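Does anyone know the best way to do this while minimizing the use of slow manual for loops? (I'd like to avoid loops because the dictionaries I want to combine may have hundreds of keys.)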
Thanks!
Answer 0 (score: 4)
You can use pandas:
from __future__ import print_function, division
import pandas as pd
import numpy as np
# Create first dictionary
n = 5
s = np.random.randint(1,101,n)
r = np.random.rand(n)
d = {"r":r,"s":s}
df = pd.DataFrame(d)
print(df)
# Create second dictionary
n = 2
s = np.random.randint(1,101,n)
r = np.random.rand(n)
t = np.array(["a","b"])
d2 = {"r":r,"s":s,"t":t}
df2 = pd.DataFrame(d2)
print(df2)
print(pd.concat([df, df2]))
Output:
          r   s
0  0.551402  49
1  0.620870  34
2  0.535525  52
3  0.920922  13
4  0.708109  48
          r   s  t
0  0.231480  43  a
1  0.492576  10  b
          r   s    t
0  0.551402  49  NaN
1  0.620870  34  NaN
2  0.535525  52  NaN
3  0.920922  13  NaN
4  0.708109  48  NaN
0  0.231480  43    a
1  0.492576  10    b
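If you need a dictionary of arrays back rather than a DataFrame, here is a minimal plain-NumPy sketch. `concat_dicts` is a hypothetical helper name (not a NumPy or pandas function) standing in for `SomeOperation` from the question, and the small input arrays are made up for illustration. It does loop over the keys of the second dictionary, but only once per key, and the concatenation itself is done by `np.concatenate`, so a few hundred keys should not be a bottleneck. Unlike the pandas output above, a key that exists in only one dictionary is copied over as-is and is not padded with NaN.

```python
import numpy as np


def concat_dicts(d1, d2):
    """Return a new dict in which arrays under shared keys are concatenated.

    Hypothetical helper for illustration; keys found in only one of the
    dictionaries are carried over unchanged.
    """
    merged = dict(d1)  # shallow copy, so d1 itself is not modified
    for key, arr in d2.items():
        if key in merged:
            # Key exists in both: append d2's values after d1's values.
            merged[key] = np.concatenate([merged[key], arr])
        else:
            # Key only exists in d2: take its array as-is.
            merged[key] = arr
    return merged


# Illustrative inputs (made-up values, same shape as the question's example).
d = {"r": np.array([0.1, 0.2, 0.3]), "s": np.array([75, 25, 88])}
d2 = {
    "r": np.array([0.4, 0.5]),
    "s": np.array([78, 92]),
    "t": np.array(["a", "b"]),
}

d3 = concat_dicts(d, d2)
print(d3)
```

Returning a new dictionary instead of updating `d` in place is a design choice of this sketch; assigning the result back (`d = concat_dicts(d, d2)`) gives the behaviour asked for in the question.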