I have a numeric vector:
vec = c(1464.556644,552.6007169,155.4249747,1855.360016,1315.874155,2047.980206,2361.475519,4130.530507,1609.572131,4298.980363,697.6034771,312.080866,2790.738644,1116.406288,989.6391649,2683.393338,3032.080837,2462.137352,2964.362507,1182.894473,1268.968128,4495.503015,576.1063996,232.4996213,1355.256694,1336.607876,2506.458008,1242.918255,3645.587384)
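I would like to pick n = 5 points from it that are as evenly spaced as possible. In other words, I want to get the points from vec that are closest to these points: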
seq(min(vec),max(vec),(max(vec)-min(vec))/(n-1))
What is the fastest way to achieve this?
Answer 0 (score: 6)
I upvoted the other answer; if it had not been posted so quickly, it would have been my quick-and-dirty solution too :-)
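I think a more robust approach is to solve an integer programming problem. Among other things, it rules out selecting the same point more than once.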
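As a rough sketch, the integer program can be written out as below. The notation is an assumption introduced here for illustration and does not appear in the original answer: $a_i$ are the $n$ ideal (evenly spaced) values, $v_j$ are the $N$ elements of vec, and $x_{ij} = 1$ means that element $j$ is chosen for ideal value $i$.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i=1}^{n}\sum_{j=1}^{N} \lvert a_i - v_j \rvert \, x_{ij} \\
\text{s.t.}\quad & \sum_{j=1}^{N} x_{ij} = 1, \qquad i = 1,\dots,n \\
& \sum_{i=1}^{n} x_{ij} \le 1, \qquad j = 1,\dots,N \\
& x_{ij} \in \{0, 1\}
\end{aligned}
```

The first constraint gives every ideal value exactly one element; the second keeps any element from being used twice.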
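library(lpSolve)

n <- 5
N <- length(vec)
# Evenly spaced target values between min(vec) and max(vec)
ideal <- seq(min(vec), max(vec), (max(vec) - min(vec)) / (n - 1))
# cost.mat[i, j] = distance between target i and element j of vec
cost.mat <- outer(ideal, vec, function(x, y) abs(x - y))
# Each target gets exactly one element of vec
row.signs <- rep("==", n)
row.rhs <- rep(1, n)
# Each element of vec is used at most once
col.signs <- rep("<=", N)
col.rhs <- rep(1, N)
sol <- lp.transport(cost.mat, "min", row.signs, row.rhs,
                    col.signs, col.rhs)$solution
# Row i of sol has a single 1 in the column of the chosen element
final <- vec[apply(sol, 1, which.max)]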
This will certainly be a bit slower, but in my opinion it is the only "optimal and 100% reliable" approach.
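To illustrate the duplicate-selection concern, here is a minimal sketch using made-up data (the vector x below is hypothetical, not the vec from the question). It applies the simple nearest-point lookup to a vector whose values are bunched at one end and checks whether any element is picked more than once.

```r
# Hypothetical data: four values bunched near 0 and one far away
x <- c(0, 1, 2, 3, 100)
n <- 5

# Evenly spaced targets: 0, 25, 50, 75, 100
ideal <- seq(min(x), max(x), length.out = n)

# For each target, index of the closest element of x
idx <- sapply(ideal, function(p) which.min(abs(x - p)))
picked <- x[idx]
print(picked)

# TRUE when at least one element of x was chosen for several targets
has_duplicates <- anyDuplicated(idx) > 0
print(has_duplicates)
```

With data like this, the nearest-point lookup returns fewer than n distinct points, which is the situation the "at most once" column constraints are meant to exclude.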
Answer 1 (score: 3)
Try:
ideal <- seq(min(vec),max(vec),(max(vec)-min(vec))/(n-1))
result <- sapply(ideal, function(x) vec[which.min(abs(vec-x))] )
Compare:
cbind(result,ideal)
       result    ideal
[1,]  155.425  155.425
[2,] 1242.918 1240.444
[3,] 2361.476 2325.464
[4,] 3645.587 3410.484
[5,] 4495.503 4495.503
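The same lookup can be wrapped in a small reusable function. This is only a sketch: the name nearest_to_grid and the sample vector are hypothetical, and it uses seq(..., length.out = n) as an equivalent way to build the evenly spaced targets. It computes all distances at once with outer() instead of looping over the targets with sapply().

```r
# Hypothetical helper (not from the original answers): for each point of an
# evenly spaced grid over the range of x, return the closest element of x.
nearest_to_grid <- function(x, n) {
  ideal <- seq(min(x), max(x), length.out = n)
  # dist[i, j] = |ideal[i] - x[j]|
  dist <- abs(outer(ideal, x, "-"))
  # Column index of the smallest distance in each row
  idx <- apply(dist, 1, which.min)
  data.frame(ideal = ideal, result = x[idx], index = idx)
}

# Small made-up example
x <- c(10, 12, 19, 31, 40, 52, 58, 70)
out <- nearest_to_grid(x, 4)
print(out)
```

Like the sapply() version, this does not guard against picking the same element twice; it only makes the comparison between ideal and selected values easier to inspect.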