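Using ImageNet ResNet-50 in Caffe, the prediction is a 1000-dimensional vector. Is there an easy way to translate the indices of this vector into WordNet 3.0 synset identifiers? For instance, to find that 415: 'bakery, bakeshop, bakehouse' is "n02776631"?
I note that a similar question, Get ImageNet label for a specific index in the 1000-dimensional output tensor in torch, asked about the human-readable label associated with an index, and the answer pointed to the index-to-label mapping available at this URL: {{3} }
From the human-readable label, I suppose the WordNet synset identifier could be found through the label-to-synset mapping on this page: https://gist.github.com/maraoz/388eddec39d60c6d52d4 but I wonder whether this has already been done?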
Answer 0 (score: 0)
Building the mapping from the data at https://gist.github.com/maraoz/388eddec39d60c6d52d4 and http://image-net.org/challenges/LSVRC/2015/browse-synsets seems fairly straightforward:
{0: {'id': '01440764-n',
     'label': 'tench, Tinca tinca',
     'uri': 'http://wordnet-rdf.princeton.edu/wn30/01440764-n'},
 1: {'id': '01443537-n',
     'label': 'goldfish, Carassius auratus',
     'uri': 'http://wordnet-rdf.princeton.edu/wn30/01443537-n'},
 2: {'id': '01484850-n',
     'label': 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
     'uri': 'http://wordnet-rdf.princeton.edu/wn30/01484850-n'},
 ...
See https://gist.github.com/fnielsen/4a5c94eaa6dcdf29b7a62d886f540372 for the complete file.
I have not thoroughly checked whether this mapping is actually correct.
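Since the mapping is unverified, a few cheap structural checks may be worth running on it. The sketch below is only an illustration: the function name `find_mapping_problems` and its checks are my own assumptions, not part of the original answer, and passing them does not prove that each label is paired with the right synset. The duplicate-id check is included because a mapping that is joined on label text would give the same synset to any two classes that happen to share a label.

```python
import re

# Assumed id format, taken from the excerpt above: 8-digit offset plus '-n'.
ID_PATTERN = re.compile(r'\d{8}-n')
URI_PREFIX = 'http://wordnet-rdf.princeton.edu/wn30/'


def find_mapping_problems(index_to_synset, expected_size=1000):
    """Return a list of human-readable problems found in the mapping.

    Hypothetical helper: it checks structure only, not semantic correctness.
    """
    problems = []
    if sorted(index_to_synset) != list(range(expected_size)):
        problems.append(
            'indices are not exactly 0..{}'.format(expected_size - 1))
    first_index_for_id = {}
    for index, entry in sorted(index_to_synset.items()):
        synset_id = entry['id']
        if not ID_PATTERN.fullmatch(synset_id):
            problems.append(
                'index {}: malformed id {!r}'.format(index, synset_id))
        if entry['uri'] != URI_PREFIX + synset_id:
            problems.append(
                'index {}: uri does not match id'.format(index))
        if synset_id in first_index_for_id:
            problems.append('index {}: id {!r} already used by index {}'.format(
                index, synset_id, first_index_for_id[synset_id]))
        else:
            first_index_for_id[synset_id] = index
    return problems


# The three entries shown in the excerpt above.
sample = {
    0: {'id': '01440764-n', 'label': 'tench, Tinca tinca',
        'uri': URI_PREFIX + '01440764-n'},
    1: {'id': '01443537-n', 'label': 'goldfish, Carassius auratus',
        'uri': URI_PREFIX + '01443537-n'},
    2: {'id': '01484850-n', 'label': 'great white shark, white shark, '
        'man-eater, man-eating shark, Carcharodon carcharias',
        'uri': URI_PREFIX + '01484850-n'},
}
print(find_mapping_problems(sample, expected_size=3))  # prints []
```

On the full mapping, the call would be `find_mapping_problems(index_to_synset)` with the default size of 1000; any reported duplicate would point at entries that need manual inspection.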
The mapping was built with:
import ast
from pprint import pprint

import requests
from lxml import html

# Index-to-label mapping, stored in the gist as a Python dict literal.
url_index = ('https://gist.githubusercontent.com/maraoz/'
             '388eddec39d60c6d52d4/raw/'
             '791d5b370e4e31a4e9058d49005be4888ca98472/gistfile1.txt')
# Page whose links pair each label with its ImageNet synset.
url_synsets = "http://image-net.org/challenges/LSVRC/2014/browse-synsets"

# Use .text (str) rather than .content (bytes): on Python 3,
# ast.literal_eval expects a str.
index_to_label = ast.literal_eval(requests.get(url_index).text)
elements = html.fromstring(requests.get(url_synsets).content).xpath('//a')

prefix = 'http://imagenet.stanford.edu/synset?wnid='
label_to_synset = {}
for element in elements:
    href = element.get('href', '')  # some anchors may lack an href
    if href.startswith(prefix):
        # Skip the prefix and the leading 'n' of the wnid,
        # keeping only the 8-digit offset.
        label_to_synset[element.text] = href[len(prefix) + 1:]

index_to_synset = {
    k: {
        'id': label_to_synset[v] + '-n',
        'label': v,
        'uri': "http://wordnet-rdf.princeton.edu/wn30/{}-n".format(
            label_to_synset[v])
    }
    for k, v in index_to_label.items()}

pprint(index_to_synset)
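To get from a prediction vector to the identifier format used in the question ("n02776631"), the 'id' values above ("02776631-n") only need their two parts swapped. The following is a minimal sketch under that assumption; `to_wnid`, `top_wnid`, and the one-entry `sample_mapping` are hypothetical names introduced for illustration, and the scores are made up rather than real network output.

```python
def to_wnid(synset_id):
    """Convert an id such as '02776631-n' into the form 'n02776631'."""
    offset, pos = synset_id.split('-')
    return pos + offset


def top_wnid(scores, index_to_synset):
    """Return (index, wnid) for the highest-scoring class.

    `scores` is any indexable sequence of numbers, for example the
    1000-dimensional prediction vector converted to a list.
    """
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index, to_wnid(index_to_synset[best_index]['id'])


# Hypothetical stand-in for the full mapping: only the entry from the question.
sample_mapping = {
    415: {'id': '02776631-n', 'label': 'bakery, bakeshop, bakehouse'},
}

# Made-up scores with the maximum at index 415.
scores = [0.0] * 1000
scores[415] = 0.9

print(top_wnid(scores, sample_mapping))  # prints (415, 'n02776631')
```

With the real mapping, `sample_mapping` would be replaced by the `index_to_synset` dictionary built by the script above.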