My array looks like this:
foo foot oot
foo foot oot
bar bart art
bar bart art
I have a dictionary:
{ foo : 1, bar :2, foot:34, bart:54, oot:123}
I want to map it over the array to get this output:
1 34 123
1 34 123
2 54 NaN
2 54 NaN
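Note: one value has no entry in the dictionary. I was thinking of slicing each column and then using a list comprehension, but that feels wrong.
Answer 0: (score: 2)
Use a list comprehension: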
>>> lis = [['foo', 'foot', 'oot'],
['foo', 'foot', 'oot'],
['bar', 'bart', 'art'],
['bar', 'bart', 'art']]
>>> dic = { 'foo' : 1, 'bar' :2, 'foot':34, 'bart':54, 'oot':123}
>>> nan = float('nan')
>>> [[dic.get(y,nan) for y in x] for x in lis]
[[1, 34, 123], [1, 34, 123], [2, 54, nan], [2, 54, nan]]
dict.get(key, default_value): returns the value associated with key if key is found, otherwise returns default_value.
Python has no NaN literal, which is why float('nan') is used.
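As a small illustration of the two points above, the sketch below looks up one present key and one missing key with dict.get, then shows why the resulting NaN should be tested with math.isnan rather than ==. The variable names found and missing are chosen here for illustration only.

```python
import math

dic = {'foo': 1, 'bar': 2, 'foot': 34, 'bart': 54, 'oot': 123}
nan = float('nan')

# A present key returns its value; a missing key returns the default.
found = dic.get('foo', nan)
missing = dic.get('art', nan)

# NaN never compares equal to anything, itself included,
# so test for it with math.isnan rather than ==.
print(found)                 # 1
print(missing == missing)    # False
print(math.isnan(missing))   # True
```

This matters when checking the mapped result: comparing two lists that contain NaN element by element with == will report a mismatch at the NaN positions.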
Answer 1: (score: 1)
For large arrays, the following NumPy-only code may perform better:
import numpy as np

arr = np.array([['foo', 'foot', 'oot'],
['foo', 'foot', 'oot'],
['bar', 'bart', 'art'],
['bar', 'bart', 'art']])
dict_ = {'foo' : 1, 'bar' : 2, 'foot' : 34, 'bart' : 54, 'oot' : 123}
arr_flat = arr.ravel()
# list() is needed on Python 3, where keys() and values() return views
keys = np.array(list(dict_.keys()))
vals = np.array(list(dict_.values()))
sort_idx = np.argsort(keys)
keys = keys[sort_idx]
vals = vals[sort_idx]
vals = np.concatenate((vals, [np.nan]))
unique, indices = np.unique(arr_flat, return_inverse=True)
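locs = np.searchsorted(keys, unique, side='left')
# searchsorted returns len(keys) for values that sort after the last key,
# so clip before indexing into keys to avoid an IndexError
locs_safe = np.minimum(locs, len(keys) - 1)
no_match = unique != keys[locs_safe]
locs[no_match] = len(keys)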
new_arr = np.take(vals, np.take(locs, indices)).reshape(arr.shape)
# Same as new_arr = vals[locs[indices]].reshape(arr.shape)
>>> arr
array([['foo', 'foot', 'oot'],
['foo', 'foot', 'oot'],
['bar', 'bart', 'art'],
['bar', 'bart', 'art']],
dtype='|S4')
>>> new_arr
array([[ 1., 34., 123.],
[ 1., 34., 123.],
[ 2., 54., nan],
[ 2., 54., nan]])
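For reuse, the same sort-and-searchsorted idea can be wrapped in a function. The following is a minimal sketch, not part of the original answer: the name map_with_dict and its default parameter are hypothetical, it assumes a non-empty dictionary with string keys and numeric values, and it clips the searchsorted result so that values sorting after the last key do not index out of bounds.

```python
import numpy as np

def map_with_dict(arr, mapping, default=np.nan):
    """Hypothetical helper: replace each element of arr with mapping[element],
    or with default when the element is not a key. Assumes mapping is non-empty."""
    keys = np.array(list(mapping.keys()))
    vals = np.array(list(mapping.values()), dtype=float)

    # Sort keys so they can be binary-searched, keeping vals aligned.
    order = np.argsort(keys)
    keys, vals = keys[order], vals[order]

    # Append one extra slot that holds the default for unmatched elements.
    vals = np.concatenate((vals, [default]))

    # Look up each distinct element once, then expand back to the full array.
    unique, inverse = np.unique(arr.ravel(), return_inverse=True)
    locs = np.searchsorted(keys, unique, side='left')

    # searchsorted can return len(keys); clip before comparing against keys.
    safe = np.minimum(locs, len(keys) - 1)
    locs[keys[safe] != unique] = len(keys)

    return vals[locs[inverse]].reshape(arr.shape)

arr = np.array([['foo', 'foot', 'oot'],
                ['foo', 'foot', 'oot'],
                ['bar', 'bart', 'art'],
                ['bar', 'bart', 'art']])
dict_ = {'foo': 1, 'bar': 2, 'foot': 34, 'bart': 54, 'oot': 123}

new_arr = map_with_dict(arr, dict_)
print(new_arr)
```

Because the lookup runs over the distinct values only, the cost of the search depends on how many different strings the array contains rather than on its total size.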