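I am working on a scene recognition problem using a bag-of-visual-words model. This is code I adapted from the internet. The training dataset has 5 classes with 100 images each, and the random test dataset has 5000 images. I understand that I should learn the vocabulary from the training set. But should I also build a vocabulary for the test dataset?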
FEATURE = 'bag of sift';
CLASSIFIER = 'support vector machine';
categories = {'shopping', 'office', 'eating', 'chatting', 'biking'};
num_train_per_cat = 100;
vocab_size = 200;
% YOUR CODE FOR build_vocabulary.m
vocab = build_vocabulary(train_image_paths, vocab_size);
% YOUR CODE FOR get_bags_of_sifts.m
fprintf('Computing training features\n');
train_image_feats = get_bags_of_sifts(train_image_paths,vocab);
save('train_bag.mat', 'train_image_feats');
fprintf('Computing test features\n');
test_image_feats = get_bags_of_sifts(test_image_paths,vocab);
% YOUR CODE FOR svm_classify.m
test_image_feats_mat = cell2mat(test_image_feats);
test_image_feats = vl_svmdataset(test_image_feats_mat);
predicted_categories = svm_classify(train_image_feats, train_labels, test_image_feats);
Answer 0 (score: 1)
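To answer your question: no, you should not build a vocabulary from the test dataset. Instead, use the encode method to count the visual word occurrences in each test image, using the vocabulary learned from the training set. The histogram that encode produces becomes the new, reduced representation of the image.
Example: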
features = encode(vocabulary, img)
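To make the encoding step concrete, here is a minimal sketch of what such a histogram computation can look like in base MATLAB. It is not the implementation of encode or of get_bags_of_sifts: the toy vocab and descriptors matrices are made up for illustration (2-D instead of 128-D SIFT), and the L1 normalization is an assumed choice. It relies on implicit expansion, so it assumes R2016b or newer.

```matlab
% Hypothetical toy data: 3 visual words and 5 descriptors, 2-D for
% readability (real SIFT descriptors are 128-D).
vocab = [0 0; 10 0; 0 10];                 % K-by-D, learned from TRAINING data only
descriptors = [1 1; 9 1; 0 9; 1 0; 8 -1];  % N-by-D, descriptors of one image

K = size(vocab, 1);

% Squared Euclidean distance between every descriptor and every word,
% via ||a - b||^2 = ||a||^2 - 2*a*b' + ||b||^2 (N-by-K result).
d2 = sum(descriptors.^2, 2) - 2 * (descriptors * vocab') + sum(vocab.^2, 2)';

% Index of the nearest visual word for each descriptor.
[~, nearest] = min(d2, [], 2);

% Count how often each word occurs, then L1-normalize (an assumed choice)
% so images with different numbers of descriptors are comparable.
counts = accumarray(nearest, 1, [K 1])';
features = counts / sum(counts);
```

The same vocab is used whether the image comes from the training set or the test set; only the descriptors change.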
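In summary, you must encode both the training and the test datasets with the same vocabulary. The output of the encode method becomes the input to the classifier.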
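The overall flow could be sketched as follows, assuming the encode call refers to the bagOfFeatures object from MATLAB's Computer Vision Toolbox (the question itself uses VLFeat and custom functions, so this is an alternative rather than a drop-in fix). The folder names are hypothetical, fitcecoc requires the Statistics and Machine Learning Toolbox, and bagOfFeatures extracts SURF features by default rather than SIFT.

```matlab
% Hypothetical folder layout: data/train has one subfolder per category,
% data/test holds the unlabeled test images.
trainSet = imageDatastore('data/train', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
testSet = imageDatastore('data/test');

% Learn the vocabulary from the training images only.
vocabulary = bagOfFeatures(trainSet, 'VocabularySize', 200);

% Encode BOTH sets with that same vocabulary; each row is one image's
% histogram of visual word occurrences.
trainFeats = encode(vocabulary, trainSet);
testFeats = encode(vocabulary, testSet);

% Train a multiclass SVM on the training histograms and predict the
% category of every test image.
classifier = fitcecoc(trainFeats, trainSet.Labels);
predicted_categories = predict(classifier, testFeats);
```

Note that the test images never influence the vocabulary; they are only projected onto it.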