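I have implemented the algorithm from this paper (Efficiently selecting spatially distributed keypoints for visual tracking) in Java. The one thing from the paper that I have not implemented is the following suggestion (page 3, end of section 5):
The relatively expensive cell covering operation can be substantially accelerated by using a single bit to store the state of each cell of Gr. This enables the use of the bitwise OR operation to "cover" consecutive patches using precomputed bitmasks that implement the coverage applied at a given bit-offset position.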
throughput ~3 500 ops/sec
Switched to using System.arraycopy for filling instead of for loops.
throughput ~5 600 ops/sec
Optimized array initialization (using a cache).
throughput ~6 000 ops/sec
throughput ~6 500 ops/sec
throughput ~6 500 ops/sec
throughput ~6 700 ops/sec :(
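Now I am out of ideas, except for converting the boolean[] to a byte[] and using bitmasks for set/get, if I have understood the paper's suggestion correctly.
Any challengers?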
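For reference, here is a minimal sketch of how the paper's suggestion could be read. It is an assumption-laden illustration, not the paper's code: it packs each row into 64-bit words (a long[] rather than the byte[] mentioned above), and the class name BitGrid and its methods are hypothetical. It has not been benchmarked against the boolean[] version.

```java
// Hypothetical bit-packed grid: one bit per cell, rows padded to whole 64-bit words.
final class BitGrid {
    // Precomputed masks: MASKS[len] has the lowest `len` bits set (len in 0..64).
    private static final long[] MASKS = new long[65];

    static {
        for (int len = 0; len <= 64; len++) {
            MASKS[len] = (len == 64) ? -1L : (1L << len) - 1;
        }
    }

    private final int wordsPerRow;
    private final long[] words;

    BitGrid(int cols, int rows) {
        this.wordsPerRow = (cols + 63) >>> 6;
        this.words = new long[wordsPerRow * rows];
    }

    boolean isSet(int col, int row) {
        return (words[row * wordsPerRow + (col >>> 6)] & (1L << (col & 63))) != 0;
    }

    // Sets the cells [col, col + len) of one row with at most two OR operations.
    // Assumes 1 <= len <= 64 and that the span lies inside the row.
    void coverSpan(int row, int col, int len) {
        long mask = MASKS[len];
        int idx = row * wordsPerRow + (col >>> 6);
        int shift = col & 63;
        words[idx] |= mask << shift;
        if (shift + len > 64) {
            // The span straddles a word boundary: spill the high bits into the next word.
            words[idx + 1] |= mask >>> (64 - shift);
        }
    }
}
```

With this layout, covering a circle would mean calling coverSpan once per circle row with that row's start column and length, so each row costs one or two OR operations instead of one write per cell.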
Here is the JMH test:
    import java.util.ArrayList;
    import java.util.List;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.Level;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;

    public class KeyPointFilterBenchmark {

        private static final int DEFAULT_RADIUS = 10;

        @Benchmark
        public List<OpenCVKeyPoint> benchmarkFilterByRadius(KeyPointFilterState state) {
            return state.filter.filterByRadius(DEFAULT_RADIUS, state.list);
        }

        @State(Scope.Thread)
        public static class KeyPointFilterState {

            private static final int NUMBER_OF_POINTS = 12_000;
            private static final int IMAGE_WIDTH = 640;
            private static final int IMAGE_HEIGHT = 480;
            private static final int RESPONSE_RANGE = 255;

            private List<OpenCVKeyPoint> list;
            private KeyPointFilter filter;

            @Setup(Level.Trial)
            public void doSetup() {
                // Random key points spread over the whole image.
                this.list = new ArrayList<>();
                for (int i = 0; i < NUMBER_OF_POINTS; i++) {
                    double x = Math.random() * IMAGE_WIDTH;
                    double y = Math.random() * IMAGE_HEIGHT;
                    float response = (float) (Math.random() * RESPONSE_RANGE);
                    list.add(new OpenCVKeyPoint(x, y, response));
                }
                this.filter = new KeyPointFilter(IMAGE_WIDTH, IMAGE_HEIGHT);
            }
        }
    }
Current implementation:
    import java.util.ArrayList;
    import java.util.List;

    public class KeyPointFilter {

        private boolean[] matrix;
        private final int rowCount;
        private final int colCount;
        // Matrix dimensions include a border of `radius` cells on every side.
        private int matrixColCount;
        private int matrixRowCount;
        private boolean[] ones;
        private int radiusInitialized;

        public KeyPointFilter(int colCount, int rowCount) {
            this.colCount = colCount;
            this.rowCount = rowCount;
        }

        void init(int radius) {
            if (radiusInitialized == radius) {
                // Already initialized, just reset.
                this.matrix = new boolean[matrixRowCount * matrixColCount];
                return;
            }

            this.matrixRowCount = rowCount + radius * 2;
            this.matrixColCount = colCount + radius * 2;
            this.matrix = new boolean[matrixRowCount * matrixColCount];

            // Initialize a one array, to use in the coverAround arraycopy optimization.
            this.ones = new boolean[matrixColCount];
            for (int i = 0; i < ones.length; i++) {
                ones[i] = true;
            }

            radiusInitialized = radius;
        }

        public List<OpenCVKeyPoint> filterByRadius(int radius, List<OpenCVKeyPoint> input) {
            init(radius);

            List<OpenCVKeyPoint> filtered = new ArrayList<>();

            // Eliminating by covering
            for (OpenCVKeyPoint point : input) {
                int col = (int) point.getXPos();
                int row = (int) point.getYPos();
                if (!isSet(col, row)) {
                    bresenhamFilledCircle(col, row, radius);
                    filtered.add(point);
                }
            }

            return filtered;
        }

        void bresenhamFilledCircle(int col, int row, int radius) {
            // CHECKSTYLE IGNORE MagicNumber FOR NEXT 1 LINES.
            int d = (5 - radius * 4) / 4;
            int x = 0;
            int y = radius;
            int rowOffset = radius + row;
            int colOffset = radius + col;

            do {
                // Since we are filling a circle, we fill using System.arraycopy, from left to right.
                int yStart = colOffset - y;
                int yLength = 2 * y;

                // Row a bottom
                System.arraycopy(ones, 0, matrix, getIndex(rowOffset - x, yStart), yLength);

                if (x != 0) {
                    int xStart = colOffset - x;
                    int xLength = 2 * x;

                    // Row a top
                    System.arraycopy(ones, 0, matrix, getIndex(rowOffset + x, yStart), yLength);
                    // Row b bottom
                    System.arraycopy(ones, 0, matrix, getIndex(rowOffset - y, xStart), xLength);
                    // Row b top
                    System.arraycopy(ones, 0, matrix, getIndex(rowOffset + y, xStart), xLength);
                }

                if (d < 0) {
                    d += 2 * x + 1;
                } else {
                    d += 2 * (x - y) + 1;
                    y--;
                }
                x++;
            } while (x <= y);
        }

        private int getIndex(int row, int col) {
            return row * matrixColCount + col;
        }

        private void debugArray() {
            StringBuilder actualResult = new StringBuilder();
            for (int row = 0; row < getRowCount(); row++) {
                for (int col = 0; col < getColCount(); col++) {
                    actualResult.append(isSet(col, row) ? '1' : '0');
                }
                actualResult.append('\n');
            }
            System.out.println(actualResult);
        }

        public boolean isSet(int col, int row) {
            // Shift by the border so that image coordinates map into the padded matrix.
            return matrix[getIndex(row + radiusInitialized, col + radiusInitialized)];
        }

        int getRowCount() {
            return rowCount;
        }

        int getColCount() {
            return colCount;
        }
    }
Plus the key point class it uses:
    public class OpenCVKeyPoint {

        private final double xPos;
        private final double yPos;
        private final float response;

        public OpenCVKeyPoint(double xPos, double yPos, float response) {
            this.xPos = xPos;
            this.yPos = yPos;
            this.response = response;
        }

        public float getResponse() {
            return response;
        }

        public double getXPos() {
            return xPos;
        }

        public double getYPos() {
            return yPos;
        }
    }
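Answer 0 (score: 0)
You could cache more of the computations and inline the helper functions as far as possible.
Try replacing filterByRadius with this and see whether there is any improvement: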
    public List<OpenCVKeyPoint> filterByRadius(final int radius, List<OpenCVKeyPoint> input) {
        init(radius);

        // Possibly give a hint to the arraylist on how much space to allocate from the start.
        List<OpenCVKeyPoint> filtered = new ArrayList<>();

        // calculate once
        final int d_init = (5 - radius * 4) / 4;

        // Eliminating by covering
        for (OpenCVKeyPoint point : input) {
            // FIXME do the points need to be doubles, only to be cast to int?
            int col = (int) point.getXPos();
            int row = (int) point.getYPos();
            if (!isSet(col, row)) {
                final int rowOffset = (radius + row) * matrixColCount;
                final int colOffset = radius + col;

                int d = d_init;
                int x = 0;
                int y = radius;

                do {
                    final int yStart = colOffset - y;
                    final int yLength = 2 * y;
                    final int xByMatrixColCount = x * matrixColCount;
                    final int rowOffsetPlusYStart = rowOffset + yStart;

                    // Since we are filling a circle, we fill using System.arraycopy, from left to right.
                    // Row a bottom
                    System.arraycopy(ones, 0, matrix, (rowOffsetPlusYStart - xByMatrixColCount),
                            yLength);

                    if (x != 0) {
                        // Row a top
                        System.arraycopy(ones, 0, matrix, (rowOffsetPlusYStart + xByMatrixColCount),
                                yLength);

                        // -----
                        final int xLength = 2 * x;
                        final int yByMatrixColCount = y * matrixColCount;
                        final int rowOffsetPlusXStart = rowOffset + colOffset - x;

                        // Row b bottom
                        System.arraycopy(ones, 0, matrix, (rowOffsetPlusXStart - yByMatrixColCount),
                                xLength);
                        // Row b top
                        System.arraycopy(ones, 0, matrix, (rowOffsetPlusXStart + yByMatrixColCount),
                                xLength);
                    }

                    if (d < 0) {
                        d += 2 * x + 1;
                    } else {
                        d += 2 * (x - y) + 1;
                        y--;
                    }
                    x++;
                } while (x <= y);

                filtered.add(point);
            }
        }

        return filtered;
    }
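This probably won't be a big improvement, but you asked for faster, and I think this will be faster, although I have no measurements to back that up. If you benchmark it, I'd love to know the result!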
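Answer 1 (score: 0)
So, I came up with a nice optimization. The generic Bresenham algorithm paints the same positions several times near the top and bottom of the circle. With a custom strategy we can have a dedicated painting for a specific radius, for example 10, that no longer repeats any row and needs almost no computation. The custom strategy for a circle of radius 10 looks like this: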
    System.arraycopy(ones, 0, matrix, getIndex(row, col + 7), 6);
    System.arraycopy(ones, 0, matrix, getIndex(row + 1, col + 4), 12);
    System.arraycopy(ones, 0, matrix, getIndex(row + 2, col + 3), 14);
    System.arraycopy(ones, 0, matrix, getIndex(row + 3, col + 2), 16);
    System.arraycopy(ones, 0, matrix, getIndex(row + 4, col + 1), 18);
    System.arraycopy(ones, 0, matrix, getIndex(row + 5, col + 1), 18);
    System.arraycopy(ones, 0, matrix, getIndex(row + 6, col + 1), 18);
    System.arraycopy(ones, 0, matrix, getIndex(row + 7, col), 20);
    System.arraycopy(ones, 0, matrix, getIndex(row + 8, col), 20);
    System.arraycopy(ones, 0, matrix, getIndex(row + 9, col), 20);
    System.arraycopy(ones, 0, matrix, getIndex(row + 10, col), 20);
    System.arraycopy(ones, 0, matrix, getIndex(row + 11, col), 20);
    System.arraycopy(ones, 0, matrix, getIndex(row + 12, col), 20);
    System.arraycopy(ones, 0, matrix, getIndex(row + 13, col), 20);
    System.arraycopy(ones, 0, matrix, getIndex(row + 14, col + 1), 18);
    System.arraycopy(ones, 0, matrix, getIndex(row + 15, col + 1), 18);
    System.arraycopy(ones, 0, matrix, getIndex(row + 16, col + 1), 18);
    System.arraycopy(ones, 0, matrix, getIndex(row + 17, col + 2), 16);
    System.arraycopy(ones, 0, matrix, getIndex(row + 18, col + 3), 14);
    System.arraycopy(ones, 0, matrix, getIndex(row + 19, col + 4), 12);
    System.arraycopy(ones, 0, matrix, getIndex(row + 20, col + 7), 6);
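I ran the benchmark again and throughput increased, now reaching ~8 200 ops/sec.
It might go even higher if I introduced threads and processed the list in parallel, but for now this throughput is good enough.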
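The hard-coded radius-10 table above does not have to be typed by hand. The following is a rough sketch of how such a table could be generated once per radius, assuming the same Bresenham stepping as in the question and the same padded boolean[] matrix, where (row, col) is the top-left corner of the circle's bounding box. The names CircleSpans, halfWidths and cover are made up for this illustration, and the sketch has not been benchmarked.

```java
// Hypothetical helper: derives one (offset, length) span per circle row for any radius.
final class CircleSpans {

    // Returns h where h[i] is the half-width of row i (i = 0 is the top row of the
    // bounding box). The span for that row starts at column (radius - h[i]) and is 2 * h[i] long.
    static int[] halfWidths(int radius) {
        int[] h = new int[2 * radius + 1];
        int d = (5 - radius * 4) / 4;
        int x = 0;
        int y = radius;
        do {
            // Same four rows the question's loop touches; keep only the widest fill per row.
            widen(h, radius - x, y);
            widen(h, radius + x, y);
            widen(h, radius - y, x);
            widen(h, radius + y, x);
            if (d < 0) {
                d += 2 * x + 1;
            } else {
                d += 2 * (x - y) + 1;
                y--;
            }
            x++;
        } while (x <= y);
        return h;
    }

    private static void widen(int[] h, int rowIndex, int halfWidth) {
        if (halfWidth > h[rowIndex]) {
            h[rowIndex] = halfWidth;
        }
    }

    // Paints every row exactly once, using a precomputed table from halfWidths().
    static void cover(boolean[] matrix, boolean[] ones, int matrixColCount,
                      int[] h, int row, int col) {
        int radius = h.length / 2;
        for (int i = 0; i < h.length; i++) {
            int start = (row + i) * matrixColCount + col + radius - h[i];
            System.arraycopy(ones, 0, matrix, start, 2 * h[i]);
        }
    }
}
```

For radius 10, halfWidths yields half-widths 3, 6, 7, 8, 9, 9, 9, then seven rows of 10, then the mirror image, which corresponds to the 21 offsets and lengths listed above. The table would be computed once per radius (for example in init) and reused for every point; whether the loop is as fast as the fully unrolled version would need measuring.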