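I just downloaded the MATLAB toolbox VLFeat to do some research on the dense SIFT descriptor. It has a great function, vl_dsift(), that does what I need. However, I still have a question about the description of this function in vl_dsift.m, which I copy here for convenience: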
% FURTHER DETAILS ON THE GEOMETRY
%
% As mentioned, the VL_DSIFT() descriptors cover the bounding box
% specified by BOUNDS = [XMIN YMIN XMAX YMAX]. Thus the top-left bin
% of the top-left descriptor is placed at (XMIN, YMIN). The next
% three bins to the right are at XMIN + SIZE, XMIN + 2*SIZE, XMIN +
% 3*SIZE. The X coordinate of the center of the first descriptor is
% therefore at (XMIN + XMIN + 3*SIZE) / 2 = XMIN + 3/2 * SIZE. For
% instance, if XMIN = 1 and SIZE = 3 (default values), the X
% coordinate of the center of the first descriptor is at 1 + 3/2 * 3
% = 5.5. For the second descriptor immediately to its right this is
% 5.5 + STEP, and so on.
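As stated above, the default values of XMIN and SIZE are 1 and 3, and according to the description the X coordinate of the center is (1 + 1 + 3*3)/2 = 5.5. But as we know, a SIFT descriptor is computed over a 4 x 4 grid of bins, and each bin is 3 pixels wide here, so the region covers 4*3 x 4*3 = 144 pixels in total and each row/column is 3*4 = 12 pixels long. The center of the region should therefore be at (12/2, 12/2) = (6, 6), not (5.5, 5.5) as the documentation says. This has puzzled me for a while. Does anyone know which one is correct? Thank you very much!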
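To make the two calculations concrete, here is a minimal MATLAB sketch that reproduces both numbers. It does not call VLFeat; the variable names are illustrative and not part of the toolbox. The last part rests on an assumption that the quoted text does not state explicitly: that the positions XMIN, XMIN + SIZE, ... are the centers of the bins, so each bin extends SIZE/2 to either side of its position.

```matlab
% Values from the quoted documentation (stated there as defaults).
xmin    = 1;   % XMIN
binSize = 3;   % SIZE: width of one spatial bin, in pixels
numBins = 4;   % a SIFT descriptor has 4 bins along each axis

% Calculation in the documentation: bins are placed at
% XMIN, XMIN + SIZE, XMIN + 2*SIZE, XMIN + 3*SIZE.
binPos    = xmin + (0:numBins-1) * binSize;   % [1 4 7 10]
centerDoc = (binPos(1) + binPos(end)) / 2;    % XMIN + 3/2 * SIZE

% Calculation in the question: the region is numBins*SIZE pixels wide
% and its center is taken as half of that width, measured from 0.
regionWidth = numBins * binSize;
centerAlt   = regionWidth / 2;

% ASSUMPTION (not stated in the quoted text): binPos holds bin centers,
% so the region extends SIZE/2 beyond the first and last position.
xLeft  = binPos(1)   - binSize / 2;
xRight = binPos(end) + binSize / 2;

fprintf('documentation center = %.1f, width/2 = %.1f\n', centerDoc, centerAlt);
fprintf('extent = [%.1f, %.1f], width = %.1f, midpoint = %.1f\n', ...
        xLeft, xRight, xRight - xLeft, (xLeft + xRight) / 2);
```

Under that bin-center assumption the region is still 12 pixels wide, but it runs from -0.5 to 11.5 rather than from 0 to 12, which would account for the half-pixel difference between 5.5 and 6. Whether VLFeat actually uses this convention is something the quoted comment alone does not settle.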