Question

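I'm not familiar with R, so I don't know whether this is a simple problem. I want to create ranges of IDs based on the sum of their values, so that the values in the first range account for 60% (or thereabouts) of the total sum. This is the data frame, DF.

The idea is to first sort DF by ID, then find the range of IDs over which the sum of the values reaches 60% and group those IDs together. The rest would be split into groups of 10%, 10%, 10%, 10% (or the split could be random: 10%, 10%, 20% or 5%, 15%, 10%, 10%). My data frame would then look like this: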

ID     Val
3-24   35           # (11+6+8+1+3+2) ~ 62% of the total sum of `Val` column
46-59  9            # (1+2+6) = 18% of the total sum of `Val` column
98     7            # (2+1+4) = 14% of the total sum of `Val` column

This is what I have tried:

DF = DF[with(DF, order(DF$ID)), ]
perce = round(sum(DF$ID)*60/100)
for (i in 1:dim(DF)[1]) {
  if (sum(DF$Val) == perce) {
    ID = which(DF$ID)
    .
    .
    .
    # put those ID's in a range that constitutes 60%
  }
}

I don't know whether this is feasible?

Thanks, DOMNICK

Answer 1

First, we sort the data and get the `sum` of `Val` for each `ID` group.

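Then we can use `cumsum(Val)` to get the running total. We need `lag` so that, for each row, it represents "the sum of the values of all `ID` groups before this row".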
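A minimal sketch of this step, assuming dplyr is available. The per-ID sums are made up because the question's data was not shown, and `running` and `before` are illustrative names. Shifting the running total down by one row means each ID is judged by the total that precedes it, so the ID that crosses a threshold still lands in the lower group.

```r
library(dplyr)

# Hypothetical per-ID sums, already sorted by ID (not the question's data)
Val <- c(14, 9, 2, 2, 6, 7)

# Running total including the current row
running <- cumsum(Val)                      # 14 23 25 27 33 40

# Shift down by one row so each entry is the total of all earlier rows;
# the first row has nothing before it, hence default = 0
before <- dplyr::lag(running, default = 0)  # 0 14 23 25 27 33
```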

Now we can use `cut` to assign the cumulative sum to interval groups such as (-∞, 0.6 * total], (0.7 * total, 0.8 * total], and (0.8 * total, ∞).
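As a rough illustration of the binning, the sketch below uses base R's `cut` on made-up lagged totals. The break fractions 0.6, 0.7 and 0.8 are taken from the intervals mentioned above. Because `cut` needs contiguous breaks, an interval from 0.6 to 0.7 of the total is assumed here as well. `before`, `total`, `breaks` and `bin` are illustrative names.

```r
# Hypothetical lagged running totals and grand total (not the question's data)
before <- c(0, 14, 23, 25, 27, 33)
total  <- 40

# Break points as fractions of the total; the 0.7 break is an assumption
breaks <- c(-Inf, 0.6, 0.7, 0.8, Inf) * total

# Intervals are closed on the right by default: (lower, upper]
bin <- cut(before, breaks = breaks)

# Integer codes of the interval each row falls into
as.integer(bin)   # 1 1 1 2 2 4
```

The third interval stays empty for these numbers, which is why no row gets code 3.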

Then we can `group_by` this interval to get the `sum` of `Val`.
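Putting the steps together, one possible dplyr pipeline might look like the sketch below. It is not necessarily the exact code of the original answer. The data frame is hypothetical (the question's `DF` was not shown), the 0.7 break is an assumption, and `before`, `bin` and `result` are illustrative names.

```r
library(dplyr)

# Hypothetical data: several rows per ID, in no particular order
DF <- data.frame(ID  = c(24,  3, 10, 3, 46, 98, 59, 10),
                 Val = c( 2, 10,  6, 4,  2,  7,  6,  3))

result <- DF %>%
  arrange(ID) %>%                        # sort by ID
  group_by(ID) %>%
  summarise(Val = sum(Val)) %>%          # one row per ID with its total
  mutate(
    # total of all IDs before the current one
    before = dplyr::lag(cumsum(Val), default = 0),
    # interval of the grand total that this preceding sum falls into
    bin = cut(before, breaks = c(-Inf, 0.6, 0.7, 0.8, Inf) * sum(Val))
  ) %>%
  group_by(bin) %>%
  summarise(
    # "min-max" label, or a single ID when the group holds only one
    ID  = paste(unique(range(ID)), collapse = "-"),
    Val = sum(Val)
  )

result$ID    # "3-24" "46-59" "98"
result$Val   # 25 8 7
```

With this toy data the first range covers 25 of 40, about 62% of the total, because the ID that crosses the 60% mark is kept in the first group.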

Making a range of IDs based on the sum of their values in R
