Question

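Starting from a vector, I want to remove elements so that the remaining elements are increasing. I already have an iterative approach, shown below: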

test<- c(2,4,7,2,3,6,8)
while(!all(diff(test)>=0)){
    rm <- which(diff(test)<0)[1]+1
    if(!is.na(rm)) test<-test[-rm]
}

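The expected output for my example is (2,4,7,8).

Is there a smarter way to do this?

Edit: added the expected output of the algorithm.

Edit: fixed a typo in the output.

Edit: corrected my code so that it gives the desired result.

Edit: made the example more general by adding 8 at the end.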

Answer 1

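Given the output your code produces, I believe you want to remove the values that break the increasing order of your vector.

EDIT2

If you want to keep all increasing values, use a while loop: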

last_val <- test[1]
out_ind <- c(1)
i <- 2
while (i<=length(test)){
    if (test[i] >= last_val) {out_ind <- c(out_ind, i); last_val <- test[i]}
    i <- i+1
}
# out_ind holds the positions to keep; with the question's test vector:
test[out_ind]
#[1] 2 4 7 8

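EDIT1

If you just want to drop everything after the first value that fails to increase, you can use @RHertel's answer, or test[1:which(diff(test) < 0)[1]], which avoids the warning message when there is more than one negative "diff" value.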
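The indexing above still assumes that at least one decrease exists. When the vector is already non-decreasing, which(diff(test) < 0) is empty, its first element is NA, and 1:NA raises an error. A minimal sketch of a guarded version follows; the helper name increasing_prefix is made up for illustration and does not come from any answer here.

```r
# Hypothetical helper: keep the leading run of non-decreasing values.
increasing_prefix <- function(x) {
  drops <- which(diff(x) < 0)        # positions i where x[i + 1] < x[i]
  if (length(drops) == 0) return(x)  # no decrease: nothing to cut
  x[seq_len(drops[1])]               # keep everything up to the first drop
}

increasing_prefix(c(2, 4, 7, 2, 3, 6, 8))  # 2 4 7
increasing_prefix(c(2, 4, 7, 2, 3, 1, 8))  # 2 4 7 (two drops, no warning)
increasing_prefix(c(1, 2, 3))              # 1 2 3
```

Using seq_len(drops[1]) rather than 1:drops keeps the indexing well defined even when several drops are found.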

First answer, based on the expected output of the code

Here is one way:

# get the indexes of the sorted vector
ot <- order(test) 
# then you remove the value that doesn't correspond to increasing indexes
test <- test[-ot[which(diff(ot)<0)+1]]
>test
#[1] 2 2 3 6

Answer 2

Maybe this helps:

test[1:which(diff(test) < 0)]
#[1] 2 4 7

Answer 3

If you don't need 2 3 3 1 4 to yield 2 3 3 4 (i.e. equal values do not count as increasing), you can use a nice Reduce:

test <- c(2,4,7,2,3,1,8)
unique(Reduce(max, as.list(test), accumulate = TRUE))
 [1] 2 4 7 8

If you do want the repeats, I'm sure there is a better way to do this, but

test <- c(2,4,4,7,2,3,1,8)
reduce = Reduce(max, as.list(test), accumulate = TRUE)
df = data.frame(o = test, reduce = reduce)
df[df$o == df$reduce, "o"]
 [1] 2 4 4 7 8

pulls them out.
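Base R's cummax computes the same running maximum as Reduce(max, as.list(test), accumulate = TRUE), so both variants can be written without the list or the data frame. A sketch, assuming numeric input without NA values; the function name keep_increasing and its strict argument are illustrative choices, not part of this answer.

```r
# Illustrative helper built on base R's cummax().
keep_increasing <- function(x, strict = TRUE) {
  running_max <- cummax(x)
  if (strict) {
    unique(running_max)   # each new maximum once; repeated values dropped
  } else {
    x[x == running_max]   # keep every value that equals the running maximum
  }
}

keep_increasing(c(2, 4, 7, 2, 3, 1, 8))                     # 2 4 7 8
keep_increasing(c(2, 4, 4, 7, 2, 3, 1, 8), strict = FALSE)  # 2 4 4 7 8
```

With strict = FALSE a later value that ties the current maximum is kept as well, so the result is non-decreasing rather than strictly increasing.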

Answer 4

I would use a good old for-loop:

test <- c(2,4,7,2,3,9)

test2 <- rep(NA,length(test))
test2[1] <- test[1]
prev <- test[1]
for(i in 2:length(test)){
  if(prev < test[i]){
    test2[i] <- test[i]
    prev <- test[i]
  }
}
test2 <- test2[!is.na(test2)]

#> test2
#[1] 2 4 7 9
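One caveat with the loop above: for a vector of length 1, 2:length(test) evaluates to c(2, 1), so the loop visits index 2, which does not exist, and the comparison fails on NA. A sketch of the same idea using seq_along and a logical mask, assuming finite, non-missing numeric input; the name keep_increasing_loop is hypothetical.

```r
# Hypothetical variant of the for-loop: strict increase, safe for short input.
keep_increasing_loop <- function(x) {
  keep <- logical(length(x))  # all FALSE to start
  prev <- -Inf                # any finite value is larger than this
  for (i in seq_along(x)) {
    if (x[i] > prev) {
      keep[i] <- TRUE         # x[i] is a new maximum, so keep it
      prev <- x[i]
    }
  }
  x[keep]
}

keep_increasing_loop(c(2, 4, 7, 2, 3, 9))  # 2 4 7 9
keep_increasing_loop(5)                    # 5
```

The logical mask also avoids using NA as a "dropped" marker, which in the original would silently remove genuine NA entries from the input.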

Benchmark:

makeIncreasing_digEmAll <- function(test){
  test2 <- rep(NA,length(test))
  test2[1] <- test[1]
  prev <- test[1]
  for(i in 2:length(test)){
    if(prev < test[i]){
      test2[i] <- test[i]
      prev <- test[i]
    }
  }
  test2 <- test2[!is.na(test2)]
  return(test2)
}

makeIncreasing_Jcl <- function(test){
  while(!all(diff(test)>=0)){
    rm <- which(diff(test)<0)[1]+1
    if(!is.na(rm)) test<-test[-rm]
  }
  return(test)
}


set.seed(123)
test2 <- runif(n=1000,min=1,max=10000)

timeDigEmAll <- system.time(for(i in 1:200)makeIncreasing_digEmAll(test2),gcFirst=T)
timeJcl <- system.time(for(i in 1:200)makeIncreasing_Jcl(test2),gcFirst=T)

> timeDigEmAll
   user  system elapsed 
   0.17    0.00    0.17 
> timeJcl
   user  system elapsed 
  29.80    0.02   30.28

Answer 5

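I would like to thank everyone who contributed to this discussion. I used digEmAll's code to benchmark and compare all of the approaches above. The results are as follows.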

makeIncreasing_digEmAll <- function(test){
        test2 <- rep(NA,length(test))
        test2[1] <- test[1]
        prev <- test[1]
        for(i in 2:length(test)){
                if(prev < test[i]){
                        test2[i] <- test[i]
                        prev <- test[i]
                }
        }
        test2 <- test2[!is.na(test2)]
        return(test2)
}

makeIncreasing_Jcl <- function(test){
        while(!all(diff(test)>=0)){
                rm <- which(diff(test)<0)[1]+1
                if(!is.na(rm)) test<-test[-rm]
        }
        return(test)
}

makeIncreasing_Jcl2 <- function(test){

        # running maximum, keeping each new maximum once
        return(unique(cummax(test)))
}

makeIncreasing_CathG <- function(test){
        last_val <- test[1]
        out_ind <- c(1)
        i <- 2
        while (i<=length(test)){
                if (test[i] >= last_val) {out_ind <- c(out_ind, i); last_val <- test[i]}
                i <- i+1
        }
        return(test[out_ind])
}

set.seed(123)
test2 <- runif(n=1000,min=1,max=10000)

timeDigEmAll <- system.time(for(i in 1:200)makeIncreasing_digEmAll(test2),gcFirst=T)
timeJcl <- system.time(for(i in 1:200)makeIncreasing_Jcl(test2),gcFirst=T)
timeJcl2 <- system.time(for(i in 1:200)makeIncreasing_Jcl2(test2),gcFirst=T)
timeCathG <- system.time(for(i in 1:200)makeIncreasing_CathG(test2),gcFirst=T)



> timeDigEmAll
   user  system elapsed 
  0.068   0.000   0.068 
> timeJcl
   user  system elapsed 
  14.64    0.00   14.64 
> timeJcl2
   user  system elapsed 
  0.008   0.000   0.008 
> timeCathG
   user  system elapsed 
  0.124   0.000   0.124

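In conclusion, unique(cummax(test)) is the way to go. Note that it keeps each new maximum only once, so repeated values are dropped.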

Create an increasing vector
