Caffe has two data classes, Datum and Blob.
(1) Could somebody explain the difference between them and the advantages of each over the other?
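The reason I ask is that I am building two networks with the Caffe library. The first network takes an image as input and extracts features.
The second network then takes those features as input and runs a forward pass to perform detection.
What confuses me is whether the features extracted by the first network should stay in Blob format or be converted to Datum before being fed into the second network. I ask because the feature_extraction example in the Caffe library converts the Blob data to Datum in order to save it. Please see the following code.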
Datum datum;
std::vector<int> image_indices(num_features, 0);
for (int batch_index = 0; batch_index < num_mini_batches; ++batch_index) {
  feature_extraction_net->Forward();
  for (int i = 0; i < num_features; ++i) {
    const boost::shared_ptr<Blob<Dtype> > feature_blob =
        feature_extraction_net->blob_by_name(blob_names[i]);
    int batch_size = feature_blob->num();
    int dim_features = feature_blob->count() / batch_size;
    const Dtype* feature_blob_data;
    for (int n = 0; n < batch_size; ++n) {
      datum.set_height(feature_blob->height());
      datum.set_width(feature_blob->width());
      datum.set_channels(feature_blob->channels());
      datum.clear_data();
      datum.clear_float_data();
      feature_blob_data = feature_blob->cpu_data() +
          feature_blob->offset(n);
      for (int d = 0; d < dim_features; ++d) {
        datum.add_float_data(feature_blob_data[d]);
      }
      string key_str = caffe::format_int(image_indices[i], 10);
      string out;
      CHECK(datum.SerializeToString(&out));
      txns.at(i)->Put(key_str, out);
      ++image_indices[i];
      if (image_indices[i] % 1000 == 0) {
        txns.at(i)->Commit();
        txns.at(i).reset(feature_dbs.at(i)->NewTransaction());
        LOG(ERROR)<< "Extracted features of " << image_indices[i] <<
            " query images for feature blob " << blob_names[i];
      }
    }  // for (int n = 0; n < batch_size; ++n)
  }  // for (int i = 0; i < num_features; ++i)
}  // for (int batch_index = 0; batch_index < num_mini_batches; ++batch_index)
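
To make the indexing in the inner loop concrete for myself, here is a minimal stand-alone sketch that does not use Caffe at all. It assumes the blob memory is one contiguous, row-major N x C x H x W array, so that offset(n) is n * C * H * W. FakeBlob and sample_features are hypothetical names I made up for illustration; they are not part of the Caffe API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Caffe Blob: a contiguous N x C x H x W float array.
struct FakeBlob {
  int num, channels, height, width;
  std::vector<float> data;

  FakeBlob(int n, int c, int h, int w)
      : num(n), channels(c), height(h), width(w),
        data(static_cast<std::size_t>(n) * c * h * w, 0.0f) {}

  // Total number of elements, analogous to Blob::count().
  int count() const { return num * channels * height * width; }

  // Index of the first element of sample n, analogous to Blob::offset(n).
  int offset(int n) const { return n * channels * height * width; }
};

// Copy the features of sample n into their own vector, mirroring the
// add_float_data loop in the snippet above.
std::vector<float> sample_features(const FakeBlob& blob, int n) {
  assert(n >= 0 && n < blob.num);
  const int dim_features = blob.count() / blob.num;
  const float* src = blob.data.data() + blob.offset(n);
  return std::vector<float>(src, src + dim_features);
}
```

With a 50 x 4096 x 1 x 1 blob, sample_features(blob, n) returns the 4096 values of image n in the batch. As far as I can tell, that slice is what ends up in the float_data field of one Datum.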
In the code, num_mini_batches is 10, batch_size is 50, and dim_features is 4096. dim_features is clear to me: it is the depth of the output volume.
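
This is how I currently read the bookkeeping of the loop, as a small sketch. It assumes that each Forward() call processes exactly one mini-batch of batch_size images, so the outer loop visits num_mini_batches * batch_size images in total. ExtractionTotals and extraction_totals are hypothetical names for illustration only.

```cpp
#include <cstddef>

// Hypothetical summary of the work done by the extraction loop for one blob.
struct ExtractionTotals {
  std::size_t num_images;  // feature vectors (Datum records) written
  std::size_t num_floats;  // float values written across all records
};

// Assumes one Forward() call per mini-batch and one Datum per image.
ExtractionTotals extraction_totals(std::size_t num_mini_batches,
                                   std::size_t batch_size,
                                   std::size_t dim_features) {
  ExtractionTotals totals;
  totals.num_images = num_mini_batches * batch_size;
  totals.num_floats = totals.num_images * dim_features;
  return totals;
}
```

Under that assumption, extraction_totals(10, 50, 4096) gives 500 images and 2,048,000 floats for the fc7 blob. In other words, the 10 iterations would cover 500 different images rather than process one image 10 times.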
(2) What do batch_size 50 and num_mini_batches 10 mean? Why do we need to run through the loop 10 times to get the fc7 features of the images?
Could somebody explain these two questions to me?