R: Converting fractions to integer percentages that sum to exactly 100

Asked: 2014-06-26 19:57:05

Tags: r integer data-analysis frequency-distribution

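I have computed a vector of frequencies of different events, expressed as fractions and sorted in descending order. I need to interface with a tool that requires positive integer percentages that sum to exactly 100. I would like to generate the percentages in the way that best represents the input distribution. That is, I want the relationships (ratios) among the percentages to match those among the input fractions as closely as possible, despite any nonlinearity that results from cutting off the long tail.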

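I have a function that produces these percentages, but I don't think it is optimal or elegant. In particular, I would like to do more of the work in numeric space before resorting to "stupid integer tricks".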

Here is an example frequency vector:

fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9,358)))
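As a quick sanity check, the example vector can be tested against the two assumptions the function below states in its comments (sorted descending, sums to 1). This is only a sketch of such a check, not part of the original code:

```r
fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9, 358)))

# Number of events in the example
n_events <- length(fractionals)                        # 362

# The fractions should add up to 1
sums_to_one <- isTRUE(all.equal(sum(fractionals), 1))  # TRUE

# Sorted descending means the reversed vector is non-decreasing
is_descending <- !is.unsorted(rev(fractionals))        # TRUE
```

With 362 events and only 100 percentage points available, some of the tail has to be dropped before every remaining element can receive at least 1.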

Here is my function:

# Convert vector of fractions to integer percents summing to 100
percentize <- function(fractionals) {
  # fractionals is sorted descending and adds up to 1
  # drop elements that wouldn't round up to 1% vs. running total
  pctOfCum <- fractionals / cumsum(fractionals)
  fractionals <- fractionals[pctOfCum > 0.005]

  # calculate initial percentages
  percentages <- round((fractionals / sum(fractionals)) * 100)

  # if sum of percentages exceeds 100, remove proportionally
  i <- 1
  while (sum(percentages) > 100) {
    excess <- sum(percentages) - 100
    if (i > length(percentages)) {
      i <- 1
    }
    partialExcess <- max(1, round((excess * percentages[i]) / 100))
    percentages[i] <- percentages[i] - min(partialExcess,
                                           percentages[i] - 1)
    i <- i + 1
  }

  # if sum of percentages shorts 100, add proportionally
  i <- 1
  while (sum(percentages) < 100) {
    shortage <- 100 - sum(percentages)
    if (i > length(percentages)) {
      i <- 1
    }
    partialShortage <- max(1, round((shortage * percentages[i]) / 100))
    percentages[i] <- percentages[i] + partialShortage
    i <- i + 1
  }

  return(percentages)
}

Any ideas?

1 Answer:

Answer 0 (score: 0)

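How about this? It rescales the values so that they should add up to 100, but if rounding leaves the total at 99, it adds 1 to the largest frequency.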

fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9,358)))
pctOfCum <- fractionals / cumsum(fractionals)
fractionals <- fractionals[pctOfCum > 0.005]

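# Truncate each share and add 1, so every kept element is at least 1
bunnies <- as.integer(fractionals / sum(fractionals) * 100) + 1
# Rescale the elements above 1 to fill what the 1s leave of 100
bunnies[bunnies > 1] <- round(bunnies[bunnies > 1] *
  (100 - sum(bunnies[bunnies == 1])) / sum(bunnies[bunnies > 1]))
# Rounding can leave the total short of 100; give one unit to the largest element
if (sum(bunnies) < 100) bunnies[1] <- bunnies[1] + 1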

> bunnies
 [1] 45  6  3  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[26]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
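The single `+ 1` correction above only covers a total of 99. One possible way to do more of the work in numeric space, offered here as an alternative sketch rather than as either poster's method, is the largest remainder method with small shares pinned at 1. The helper name `percentize_lr` and its `total` argument are hypothetical, and the sketch assumes the tail has already been cut so that no more than `total` elements remain.

```r
# Hypothetical helper, not from the original post: pin shares below 1 at 1,
# then apportion the remaining budget with the largest remainder method.
percentize_lr <- function(fractionals, total = 100L) {
  n <- length(fractionals)
  stopifnot(n >= 1, n <= total, all(fractionals > 0))

  # Repeatedly pin every element whose share of the remaining budget is < 1
  pinned <- rep(FALSE, n)
  repeat {
    budget <- total - sum(pinned)
    quota  <- fractionals / sum(fractionals[!pinned]) * budget
    newly  <- !pinned & quota < 1
    if (!any(newly)) break
    pinned <- pinned | newly
  }

  # Floor the unpinned shares, then hand out the units lost to flooring,
  # largest fractional part first
  free     <- which(!pinned)
  base     <- floor(quota[free])
  leftover <- max(budget - sum(base), 0)
  bump     <- order(quota[free] - base, decreasing = TRUE)[seq_len(leftover)]
  base[bump] <- base[bump] + 1

  result <- rep(1L, n)
  result[free] <- as.integer(base)
  stopifnot(sum(result) == total)  # fail loudly rather than return a bad total
  result
}

# Usage with the same tail cut as above
fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9, 358)))
kept <- fractionals[fractionals / cumsum(fractionals) > 0.005]
pct  <- percentize_lr(kept)
```

For this example the sketch yields the same 45, 6, 3 followed by 46 ones as the output above. Unlike a fixed correction, it distributes however many units flooring leaves over, so it does not depend on the shortfall being exactly 1.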