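I have a huge numpy tensor and a huge dictionary. What is the fastest way to replace the entries of the tensor (assume they are all keys) with the corresponding values from the dictionary?
For example, I have several million entries, such as: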
np.asarray([[[1,2,3],[4,5,6],[7,8,9],[2,4,5,]],
[[2,3,4],[7,8,9],[10,11,23],[6,3,1]],
[[4,55,6],[90,8,2],[1,2,3],[0,94,1]],
[[6,7,8],[3,4,5],[6,7,8],[9,8,2]],
[[9,8,8],[4,5,6],[34,55,6],[3,52,2]]
...................................
...................................]
dictionary = {4:6,5:67,8:99,.........} #million entries in dictionary
Answer 0 (score: 1)
Not sure whether this is the fastest way to do it, but here goes:
In [32]: import random; import numpy as np
In [33]: arr = np.asarray([[[1,2,3],[4,5,6],[7,8,9],[2,4,5,]],
    ...:                   [[2,3,4],[7,8,9],[10,11,23],[6,3,1]],
    ...:                   [[4,55,6],[90,8,2],[1,2,3],[0,94,1]],
    ...:                   [[6,7,8],[3,4,5],[6,7,8],[9,8,2]],
    ...:                   [[9,8,8],[4,5,6],[34,55,6],[3,52,2]]])
In [34]: dct = {int(random.random()*100): int(random.random()*100) for _ in range(100)}
In [35]: arr.ravel()[:] = np.fromiter((dct.get(x, x) for x in arr.ravel()), dtype=arr.dtype)
In [36]: arr
Out[36]:
array([[[18, 94, 53],
        [71, 73,  6],
        [35,  7,  9],
        [94, 71, 73]],

       [[94, 53, 71],
        [35,  7,  9],
        [10, 42, 15],
        [ 6, 53, 18]],

       [[71, 50,  6],
        [90,  7, 94],
        [18, 94, 53],
        [ 0, 94, 18]],

       [[ 6, 35,  7],
        [53, 71, 73],
        [ 6, 35,  7],
        [ 9,  7, 94]],

       [[ 9,  7,  7],
        [71, 73,  6],
        [99, 50,  6],
        [53, 52, 94]]])
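For anyone who wants to try this outside an IPython session, here is a minimal self-contained sketch of the same dict.get-plus-np.fromiter idea. The function name replace_with_fromiter and the small example mapping are my own, not from the question. It returns a new array instead of writing in place, on the assumption that this is acceptable: assigning through arr.ravel()[:] only changes arr when ravel() returns a view, which is the case for contiguous arrays but not in general.

```python
import numpy as np


def replace_with_fromiter(arr, mapping):
    """Return a copy of arr with every entry x replaced by mapping.get(x, x).

    Hypothetical helper for illustration; entries that are not keys are kept.
    """
    flat = arr.ravel()
    # tolist() yields plain Python ints, which keeps the dict lookups cheap.
    replaced = np.fromiter(
        (mapping.get(x, x) for x in flat.tolist()),
        dtype=arr.dtype,
        count=flat.size,
    )
    return replaced.reshape(arr.shape)


arr = np.asarray([[[1, 2, 3], [4, 5, 6]],
                  [[7, 8, 9], [2, 4, 5]]])
dct = {4: 6, 5: 67, 8: 99}
result = replace_with_fromiter(arr, dct)
```

This still does one Python-level dictionary lookup per element, so it is simple rather than fast.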
Answer 1 (score: 0)
If your dict has non-negative integer keys, you can simply convert it to a vector v such that v[i] = dictionary[i]. Then just index v with the big tensor.
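import numpy as np
# Assuming foo is the big tensor and d is the dictionary
# Set up v as an identity table: v[i] == i unless i is a key of d
v = np.arange(max(d.keys())+1)
for key, val in d.items():  # items() on Python 3 (iteritems() on Python 2)
    v[key] = val
# Do the replacements all at once:
replaced = v[foo]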
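If you run into IndexError problems (foo contains values larger than the largest key), try using v = np.arange(foo.max()+1) instead. More generally, v needs max(max(d), foo.max()) + 1 entries so that every key and every entry of foo is a valid index.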
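Putting the lookup-table idea and the sizing fix together, a sketch could look like the following. The function name replace_with_lookup_table is hypothetical, and the code assumes that all keys of d and all entries of foo are non-negative integers and that foo is not empty.

```python
import numpy as np


def replace_with_lookup_table(foo, d):
    """Replace entries of foo via an identity table overwritten at the keys of d.

    Hypothetical helper; assumes non-negative integer keys and entries.
    """
    # The table must cover both the largest key and the largest entry of foo.
    size = max(max(d, default=0), int(foo.max())) + 1
    v = np.arange(size, dtype=foo.dtype)  # v[i] == i: non-keys map to themselves
    keys = np.fromiter(d.keys(), dtype=np.intp, count=len(d))
    vals = np.fromiter(d.values(), dtype=foo.dtype, count=len(d))
    v[keys] = vals   # fill in all replacements without a Python loop
    return v[foo]    # fancy indexing performs every replacement at once


foo = np.array([[1, 4], [5, 120]])
d = {4: 6, 5: 67, 8: 99}
replaced = replace_with_lookup_table(foo, d)
```

The table has as many entries as the largest value involved, so this is a good fit for dense, moderately sized keys and a poor one when a single key is enormous.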
Answer 2 (score: 0)
np.asarray([[[1,2,3],[4,5,6],[7,8,9],[2,4,5,]],
[[2,3,4],[7,8,9],[10,11,23],[6,3,1]],
[[4,55,6],[90,8,2],[1,2,3],[0,94,1]],
[[6,7,8],[3,4,5],[6,7,8],[9,8,2]],
[[9,8,8],[4,5,6],[34,55,6],[3,52,2]]
...................................
...................................]
dictionary = {4:6,5:67,8:99,.........}
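# you need to do something like this (array is the tensor above)
result = array.copy()
for k, v in dictionary.items():
    result[array == k] = v  # mask on the original array so replacements do not chain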
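Looping over the dictionary in Python gets slow once it holds millions of entries. As a possible alternative (a different technique from the loop above, not something the answer proposes), the keys can be sorted once and every array entry located with np.searchsorted. This sketch uses a hypothetical function name and assumes the keys and values are integers that fit in the array's dtype; unlike a lookup table it also copes with negative or very large keys.

```python
import numpy as np


def replace_with_searchsorted(arr, mapping):
    """Return a copy of arr with entries that are keys of mapping replaced.

    Hypothetical helper; entries that are not keys are left unchanged.
    """
    if not mapping:
        return arr.copy()
    keys = np.fromiter(mapping.keys(), dtype=arr.dtype, count=len(mapping))
    vals = np.fromiter(mapping.values(), dtype=arr.dtype, count=len(mapping))
    order = np.argsort(keys)
    keys, vals = keys[order], vals[order]
    # For each entry, the position of the first sorted key that is >= the entry.
    idx = np.searchsorted(keys, arr)
    idx = np.clip(idx, 0, len(keys) - 1)  # entries above every key would index past the end
    hit = keys[idx] == arr                # True only where the entry really is a key
    return np.where(hit, vals[idx], arr)


array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 100]])
dictionary = {4: 6, 5: 67, 8: 99, -1: 0}
result = replace_with_searchsorted(array, dictionary)
```

The cost is one sort of the keys plus a binary search per array entry, all done inside numpy.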