Question

I am not sure what to call this, but suppose I have a list of values such as:

set.seed(2084)
vals = round(runif(12, 1, 3))

[1] 2 3 2 2 1 1 3 2 1 2 2 2

I sort it:

[1] 1 1 1 2 2 2 2 2 2 2 3 3

But I would like to get a somewhat similar ordering, with 1-2-3 repeating:

1 2 3 1 2 3 1 2 2 2 2 2

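The problem is that there are 7 twos, and they pile up at the end of the list instead of alternating with the other values. I would like to get something like: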

1 2 2 3 1 2 2 2 1 2 2 3

or

1 2 2 3 1 2 2 3 1 2 2 2

How can I index the vector to get the most even distribution of values in this "grow and cut" order?

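Some thoughts of my own. Given a set of N unique values (3 here), the copies of each value should be placed as far from one another as possible (and away from the ends of the vector). So with three 1s and 10 slots, we could place them like this: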

= 1 = = = 1 = = = 1

or

= = 1 = = 1 = = 1 =

That is fine as long as the other numbers do not also need their proper positions in the list. We can add the threes:

= 3 1 = = 1 = 3 1 =

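Now only the twos are left to fill in, and they will not get ideal positions. I think it is best to start with the most abundant value.

I wanted to be clear and describe an algorithm, but I think I achieved the opposite.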
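The idea described above (start with the most abundant value, spread it over evenly spaced slots, then fit each rarer value into the nearest slots still free) could be sketched roughly as follows. This is only one reading of the description; the name `spread_greedy` and the choice of segment centers as target positions are assumptions made for illustration.

```r
# Hypothetical helper (not from the original post): place values in
# descending order of frequency, each copy going to the free slot that is
# nearest to its evenly spaced target position.
spread_greedy <- function(vals) {
  n <- length(vals)
  out <- rep(NA_character_, n)
  counts <- sort(table(vals), decreasing = TRUE)
  for (x in names(counts)) {
    m <- counts[[x]]
    # assumed targets: centers of m equal-width segments of the range 1..n
    targets <- (seq_len(m) - 0.5) * n / m + 0.5
    for (target in targets) {
      free <- which(is.na(out))
      # nearest free slot; ties go to the lower index
      out[free[which.min(abs(free - target))]] <- x
    }
  }
  # table() turned the values into strings; convert back where possible
  type.convert(out, as.is = TRUE)
}

vals <- c(2, 3, 2, 2, 1, 1, 3, 2, 1, 2, 2, 2)  # the values printed above
spread_greedy(vals)
table(spread_greedy(vals))  # same frequencies as table(vals)
```

Because the rarer values only get the slots left over by the more frequent ones, their spacing is approximate, which matches the observation that the later values "will not get ideal positions".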

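#EDIT# I guess that for larger data sets the topic could be "How to evenly distribute values in a given vector with R". If the word "set" causes misunderstanding, it can probably be dropped safely here. In this case, though, I do not expect 2 numbers with 5 slots available.

For 1 2 2 3 4 there is another option, for example 1 2 3 4 2.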

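EDIT 2

I came up with a function that handles 2 values. It is only half a solution, but the idea works. I am thinking of iterating it for more than two values, but maybe I am wrong.

Not very elegant: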

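antisort <- function(vals) {
  counts <- table(vals)
  mx <- names(which.max(counts))  # most frequent value (as a string)
  mn <- names(which.min(counts))  # least frequent value (as a string)
  mxn <- max(counts)              # how often the most frequent value occurs
  # evenly spaced positions for the most frequent value
  indx <- round(seq(from = 1, to = length(vals), length.out = mxn))
  vec <- rep(NA_character_, length(vals))
  vec[indx] <- mx
  # every remaining slot gets the other value
  vec[is.na(vec)] <- mn
  vec
}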

Data:

set.seed(2201)
vals = round(runif(12, 1, 2))

Run:

antisort(vals)

Result (never mind that it is a character vector):

[1] "2" "1" "2" "1" "2" "2" "1" "2" "1" "2" "1" "2"

Answer 1

One of these may be what you are after:

rep_len(unique(vals), length(vals))

or

rep_len(sort(unique(vals)), length(vals))
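A quick check, using the values printed in the question, suggests a caveat: both calls cycle through the distinct values, so the result has the right length but not necessarily the original frequencies. The variable names `cycled` and `same_counts` are just for illustration.

```r
vals <- c(2, 3, 2, 2, 1, 1, 3, 2, 1, 2, 2, 2)  # values from the question

cycled <- rep_len(sort(unique(vals)), length(vals))
cycled

table(vals)    # frequencies in the original vector
table(cycled)  # frequencies after cycling through the unique values

# compare the two frequency vectors
same_counts <- identical(as.vector(table(vals)), as.vector(table(cycled)))
same_counts    # FALSE here: the 7 twos are not preserved
```

So this fits when only the repeating 1-2-3 pattern matters, and not when every original element has to appear in the output.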

Answer 2

Here is one possible heuristic:

set.seed(2084)
maxn <- 3
vals = round(runif(12, 1, maxn)) #integral values

#result vector
v <- rep(NA_character_, length(vals))

#tabulate frequencies and sort in descending order
lens <- sort(table(vals), decreasing=TRUE)

#going through each distinct integral values, starting with the longest one
for (x in names(lens)) {
    #cut the result vector into roughly lens[x] number of parts
    idx <- cut(seq_along(v), breaks=lens[x])

    #fill the first NA with the current integral value
    split(v, idx) <- lapply(split(v, idx), function(subv) {
        subv[which(is.na(subv))[1L]] <- x
        subv
    })
}

#split the vector into maxn number of parts and sort each group
#the hardest part is probably how many parts to split into, which is defaulted
#to maximum of integral values in the original vector
lapply(split(v, cut(seq_along(v), breaks=maxn, labels=1L:maxn)), sort)

Output:

$`1`
[1] "1" "2" "2" "3"

$`2`
[1] "1" "2" "2" "2"

$`3`
[1] "1" "2" "2" "3"
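If a single vector is wanted rather than a list of groups, the same steps can be wrapped in a function and flattened with `unlist()`. This is a sketch only: the function name `distribute` and the default number of parts (the number of distinct values, instead of `maxn`) are assumptions, and the input is the vector printed in the question rather than a seeded random draw.

```r
# Hypothetical wrapper around the heuristic above
distribute <- function(vals, nparts = length(unique(vals))) {
  v <- rep(NA_character_, length(vals))
  lens <- sort(table(vals), decreasing = TRUE)
  for (x in names(lens)) {
    # cut the result vector into roughly lens[[x]] parts
    idx <- cut(seq_along(v), breaks = lens[[x]])
    # fill the first NA of each part with the current value
    split(v, idx) <- lapply(split(v, idx), function(subv) {
      subv[which(is.na(subv))[1L]] <- x
      subv
    })
  }
  # sort within each of the nparts groups, then flatten to one vector
  parts <- split(v, cut(seq_along(v), breaks = nparts, labels = FALSE))
  as.integer(unlist(lapply(parts, sort), use.names = FALSE))
}

vals <- c(2, 3, 2, 2, 1, 1, 3, 2, 1, 2, 2, 2)
distribute(vals)
# [1] 1 2 2 3 1 2 2 2 1 2 2 3
```

For this input the flattened result is the three groups shown above placed one after another. `as.integer()` assumes the values are whole numbers, as they are here.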

Related link: https://cs.stackexchange.com/questions/29709/algorithm-to-distribute-items-evenly
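For comparison, here is a different interleaving heuristic that needs no loop. It is not taken from the linked thread; it is a sketch that assumes each value's k-th copy "wants" the fractional position (k - 0.5) / n, where n is that value's count, and then orders all elements by that position. The name `spread_by_position` is made up for this example.

```r
# Hypothetical helper: order elements by an assumed ideal fractional position
spread_by_position <- function(vals) {
  # k: running index of each element within its own value group (1..n)
  k <- ave(seq_along(vals), vals, FUN = seq_along)
  # n: size of the group each element belongs to
  n <- ave(seq_along(vals), vals, FUN = length)
  # the k-th of n copies targets the center of the k-th of n equal segments
  pos <- (k - 0.5) / n
  # ties in position are broken by the value itself
  vals[order(pos, vals)]
}

vals <- c(2, 3, 2, 2, 1, 1, 3, 2, 1, 2, 2, 2)
spread_by_position(vals)
table(spread_by_position(vals))  # frequencies are unchanged
```

Unlike the slot-filling heuristic, this keeps the input type and works for any number of distinct values, but it does not produce the sorted 1-2-3 groups; which trade-off is preferable depends on what "evenly" should mean for the data.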
