我有一个递增的向量,例如以下一个:
set.seed(1)
numbers <- cumsum(abs(rnorm(10,100,100)))
# [1] 37.35462 155.71895 172.15609 431.68417 564.63495 582.58811 731.33101 905.16348 1062.74162 1132.20278
我想选择一个最少的有效数字,然后四舍五入这些数字,以确保我始终保持足够的数字,以免连续的数字不会四舍五入到相同的值。
请参见以下示例(预期输出):
magic(numbers, n = 1)
# [1] 40 160 170 400 560 580 700 900 1060 1130
37.35462
舍入到40
,因为在可能的情况下我只在这里要求输入一位数字(n = 1
)155.71895
舍入到200
,因为172.15609
也将被同样的规则舍入到200
,所以我将155.71895
舍入到{ {1}}和160
至172.15609
170
舍入到431.68417
,因为它离400
和172.15609
足够远了等...
对于n = 2或3,我们将得到:
564.63495
我的目标是获取非线性分布的分位数的可读值。
答案 0 :(得分:5)
#' Minimum preferred significant digits
#'
#' @details
#' Facilitate reducing numbers to their least *distinguishable*
#' significant digits, where "distinguishable" means
#' "between neighbors". This means that if reducing more digits would
#' cause two neighbors to reduce to the same number, then the
#' reduction cannot take place.
#'
#' References:
#'
#' - [Original question on StackOverflow](https://stackoverflow.com/q/51616332/3358272) (and [my answer](https://stackoverflow.com/a/51617325/3358272))
#'
#' @param numbers numeric, length 2 or more
#' @param n integer, number of preferred remaining significant digits
#' @return numeric vector
#' @export
#' @md
#' @examples
#' \dontrun{
#' set.seed(1)
#' numbers <- cumsum(abs(rnorm(10,100,100)))
#' # [1] 37.35462 155.71895 172.15609 431.68417 564.63495 582.58811 731.33101 905.16348 1062.74162 1132.20278
#' magic(numbers, 1)
#' # [1] 40 160 170 400 560 580 700 900 1060 1130
#' magic(numbers, 2)
#' # [1] 37 160 170 430 560 580 730 910 1060 1130
#' magic(numbers, 3)
#' # [1] 37.4 156.0 172.0 432.0 565.0 583.0 731.0 905.0 1060.0 1130.0
#' magic(c(1,2.4,2.6,4),1)
#' # [1] 1 2 3 4
#' }
magic <- function(numbers, n=1L) {
stopifnot(length(numbers) > 1L)
logscale <- ceiling(log10(abs(numbers)))
logdiff <- log10(diff(numbers))
keepoom <- floor(pmin(c(Inf, logdiff), c(logdiff, Inf)))
roundpoints <- 5*(10^keepoom)
out <- signif(numbers, pmax(n, logscale - (1+keepoom)))
dupes <- duplicated(out)
if (any(dupes)) {
dupes <- dupes | c(dupes[-1], FALSE)
out2 <- signif(numbers, pmax(n, logscale - keepoom))
out[dupes] <- out2[dupes]
}
out
}
样品用量:
magic(numbers, 1)
# [1] 40 160 170 400 560 580 700 900 1060 1130
## [1] 40 160 170 400 560 580 700 900 1060 1130 # yours
magic(numbers, 2)
# [1] 37 160 170 430 560 580 730 910 1060 1130
## [1] 37 160 170 430 560 580 730 910 1060 1130 # yours
magic(numbers, 3)
# [1] 37.4 156.0 172.0 432.0 565.0 583.0 731.0 905.0 1060.0 1130.0
## [1] 37.4 156 172 432 565 583 731 905 1060 1130 # yours
magic(c(1,2.4,2.6,4),1)
# [1] 1 2 3 4
## [1] 1:4 # yours, from comments
答案 1 :(得分:1)
我想出了一个递归选项,从@ r2evans借用signif
:
magic <- function(numbers,n){
rounded <- signif(numbers,n)
dupes <- duplicated(rounded) | duplicated(rounded,fromLast = TRUE)
if (any(dupes)) rounded[dupes] <- magic(numbers[dupes], n+1)
rounded
}
magic(numbers,1)
# [1] 40 160 170 400 560 580 700 900 1060 1130
magic(numbers,2)
# [1] 37 160 170 430 560 580 730 910 1060 1130
magic(numbers,3)
# [1] 37.4 156.0 172.0 432.0 565.0 583.0 731.0 905.0 1060.0 1130.0
如@DigEmAll在评论中所述,当原始向量中存在重复项时,它会失败(这在我陈述的用例中确实可能发生)。
答案 2 :(得分:-1)
i=0
while(length(unique(numbers))==length(numbers)&&i<20){i<-i+1;numbers<-round(numbers,digits=(20-i));}
只要代码之间的长度不再相等或超过您的i,此代码就会运行,当您的视差很小时,只需将20调整为更高的值即可。
希望有帮助。
我想选择一个最少的有效数字然后四舍五入 这些数字,确保我始终保持足够的数字,以便 连续的数字将不会四舍五入为相同的值。
根据声明,我的结果是:
set.seed(1)
numbers <- cumsum(abs(rnorm(10,100,100)))
numbers<-numbers/10000
i=0
while(length(unique(numbers))==length(numbers)&&i<20){i<-i+1;numbers<-round(numbers,digits=(20-i));}
编辑:我看到问题出在哪里:您不仅要舍入数字,还要舍入逗号上方的值,如果要舍入,只需变换变量,将其除以(在这种情况下为10000)并乘以他们之后。但是我想我发现了另一个错误,该代码实际上只提供了i,因此您需要运行:
set.seed(1)
numbers <- cumsum(abs(rnorm(10,100,100)))
然后使用之前的i来运行
round(numbers/10000,digits=(20-i+1))*10000
对不起,那一团糟不得不离开,只看一下结果。