Sampling equally spaced points from a numeric vector

Time: 2014-04-18 00:25:08

Tags: r optimization sample

I have a numeric vector:

vec = c(1464.556644,552.6007169,155.4249747,1855.360016,1315.874155,2047.980206,2361.475519,4130.530507,1609.572131,4298.980363,697.6034771,312.080866,2790.738644,1116.406288,989.6391649,2683.393338,3032.080837,2462.137352,2964.362507,1182.894473,1268.968128,4495.503015,576.1063996,232.4996213,1355.256694,1336.607876,2506.458008,1242.918255,3645.587384)

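I want to sample n = 5 points from it that are as evenly spaced as possible. In other words, I want the elements of vec that are closest to these values: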

seq(min(vec),max(vec),(max(vec)-min(vec))/(n-1))

What is the fastest way to achieve this?

2 Answers:

Answer 0: (score: 6)

I upvoted the other answer; had it not been posted so quickly, it would have been my quick-and-dirty solution too :-)

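I think a more robust approach is to solve an integer programming problem. Among other things, it rules out selecting the same point more than once.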

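n <- 5
N <- length(vec)
# evenly spaced target values between min(vec) and max(vec)
ideal <- seq(min(vec),max(vec),(max(vec)-min(vec))/(n-1))

library(lpSolve)
# cost.mat[i, j] is the distance between ideal[i] and vec[j]
cost.mat  <- outer(ideal, vec, function(x, y) abs(x-y))
# each ideal value is matched to exactly one point of vec ...
row.signs <- rep("==", n)
row.rhs   <- rep(1, n)
# ... and each point of vec is used at most once
col.signs <- rep("<=", N)
col.rhs   <- rep(1, N)
sol <- lp.transport(cost.mat, "min", row.signs, row.rhs,
                                     col.signs, col.rhs)$solution

# row i of sol holds a single 1, in the column of the selected point
final <- vec[apply(sol, 1, which.max)]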

This will certainly be a bit slower, but in my opinion it is the only approach that is "optimal and 100% reliable".
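To illustrate the duplicate-selection concern, here is a minimal base-R sketch. The vector `vec_small` and the other names are made up for this example and are not from the question. It builds the same kind of distance matrix and picks the nearest point for each target independently, which is what a plain nearest-point search does:

```r
# Hypothetical toy data: three clustered points and one far away
vec_small   <- c(0, 1, 2, 10)
n_small     <- 4
ideal_small <- seq(min(vec_small), max(vec_small), length.out = n_small)

# cost_small[i, j] is the distance between ideal_small[i] and vec_small[j]
cost_small <- outer(ideal_small, vec_small, function(x, y) abs(x - y))

# Nearest point for each target, chosen row by row with no shared constraint
picked <- vec_small[apply(cost_small, 1, which.min)]
print(picked)

# A non-zero value means some point was selected more than once
print(anyDuplicated(picked))
```

Here the last two targets both land on the point 10. The column constraints in the transportation formulation (each point used at most once) are what forbid this outcome.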

Answer 1: (score: 3)

Try:

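n <- 5  # number of points to sample, as in the question
ideal <- seq(min(vec),max(vec),(max(vec)-min(vec))/(n-1))
# for each ideal value, pick the element of vec closest to it
result <- sapply(ideal, function(x) vec[which.min(abs(vec-x))] )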

Compare:

cbind(result,ideal)

       result    ideal
[1,]  155.425  155.425
[2,] 1242.918 1240.444
[3,] 2361.476 2325.464
[4,] 3645.587 3410.484
[5,] 4495.503 4495.503
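Since the question asks about speed, one possible variant is sketched below. It is not part of either answer, and `nearest_sorted` is a hypothetical helper name. It sorts the vector once and uses `findInterval()` to locate the two neighbours of each target, instead of scanning the whole vector per target. Whether this is actually faster depends on the vector length and the number of targets, so treat it as something to benchmark rather than a guaranteed improvement. Like the `sapply` approach, it can return the same point more than once.

```r
# Hypothetical helper: nearest element of `vec` to each of `n` evenly
# spaced targets, using one sort plus a search over the sorted values.
nearest_sorted <- function(vec, n) {
  ideal <- seq(min(vec), max(vec), length.out = n)
  s <- sort(vec)
  # findInterval() gives, for each target, how many sorted values are <= it
  lo <- pmax(findInterval(ideal, s), 1L)  # left neighbour, clamped to 1
  hi <- pmin(lo + 1L, length(s))          # right neighbour, clamped to N
  # keep whichever neighbour is closer; ties go to the left neighbour
  ifelse(abs(s[hi] - ideal) < abs(ideal - s[lo]), s[hi], s[lo])
}

# Made-up data, for illustration only
set.seed(1)
x <- runif(1000, 0, 5000)

fast <- nearest_sorted(x, 5)
slow <- sapply(seq(min(x), max(x), length.out = 5),
               function(t) x[which.min(abs(x - t))])

# Both approaches should agree on this data
print(cbind(fast, slow))
```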