Question

I have a data frame diff in R that looks like this:

> head(diff)
        svd next_svd      delta Component Selected?
1 286178.77 54993.91 4.20382638                TRUE
2  54993.91 25194.36 1.18278651                TRUE
3  25194.36 19643.22 0.28259842               FALSE
4  19643.22 15580.04 0.26079366               FALSE
5  15580.04 14549.86 0.07080343               FALSE
6  14549.86 11496.45 0.26559570               FALSE

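I want to check the value of Component Selected? and update the column based on that check. Basically, if the entry in the row above is FALSE, I want the current entry to be changed to FALSE.

I achieved the desired effect with the following code: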

  for(row in 2:nrow(diff)){
    if(!diff$`Component Selected?`[row-1]){
      diff$`Component Selected?`[row]<- FALSE
    }
  }

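I have been told to avoid for loops in R in order to speed up my code, so I was wondering whether anyone has a faster solution that achieves this without a for loop.

Thanks!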

Answer 1

One solution could be to use lag. Use the lag function to check the previous value. If the previous value is TRUE or NA, do not change the value in the current row. Otherwise change it to FALSE.

  library(dplyr)

  df <- read.table(
    text = "svd next_svd delta Component_Selected
    286178.77 54993.91 4.20382638 TRUE
    54993.91 25194.36 1.18278651 TRUE
    25194.36 19643.22 0.28259842 FALSE
    19643.22 15580.04 0.26079366 FALSE
    15580.04 14549.86 0.07080343 FALSE
    14549.86 11496.45 0.26559570 FALSE",
    header = TRUE, stringsAsFactors = FALSE)

  # Keep the current value when the previous one is NA (first row) or TRUE;
  # otherwise set it to FALSE
  df %>%
    mutate(Component_Selected = ifelse(is.na(lag(Component_Selected)) | lag(Component_Selected),
                                       Component_Selected, FALSE))

        svd next_svd      delta Component_Selected
1 286178.77 54993.91 4.20382638               TRUE
2  54993.91 25194.36 1.18278651               TRUE
3  25194.36 19643.22 0.28259842              FALSE
4  19643.22 15580.04 0.26079366              FALSE
5  15580.04 14549.86 0.07080343              FALSE
6  14549.86 11496.45 0.26559570              FALSE
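One thing worth checking: lag looks at the original previous value, while the loop in the question looks at the previous value after it may already have been updated. On the sample data the two agree, but they can differ when a TRUE follows a FALSE. The sketch below is a base-R illustration on a made-up vector (not the question's data), assuming the column contains no NA values; the helper names loop_update, lag_update and cum_update are hypothetical. If the intent is that every row after the first FALSE becomes FALSE, cummin appears to reproduce the loop without iterating.

```r
# Made-up example vector, chosen so that a TRUE follows a FALSE
x <- c(TRUE, FALSE, TRUE, TRUE)

# The loop from the question, wrapped in a helper for comparison
loop_update <- function(v) {
  for (i in seq_along(v)[-1]) {
    if (!v[i - 1]) v[i] <- FALSE
  }
  v
}

# Base-R equivalent of the lag approach: compare with the ORIGINAL previous value
lag_update <- function(v) {
  prev <- c(NA, v[-length(v)])
  ifelse(is.na(prev) | prev, v, FALSE)
}

# Cumulative version: once a FALSE is seen, all later entries become FALSE
cum_update <- function(v) as.logical(cummin(v))

loop_res <- loop_update(x)  # TRUE FALSE FALSE FALSE
lag_res  <- lag_update(x)   # TRUE FALSE FALSE TRUE
cum_res  <- cum_update(x)   # TRUE FALSE FALSE FALSE
```

For a column where all TRUE values come before all FALSE values, as in the sample, all three give the same result.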


Answer 2

In base R, you can do this:

df$Component_Selected[setdiff(which(!df$Component_Selected), nrow(df)) + 1] <- FALSE

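So, basically:

you get the indexes where the column is FALSE,
you remove the index of the last row (if it is present),
you add one to all of these indexes,
and finally, you assign FALSE to these new indexes.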
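The steps above can be traced one at a time. This is only a sketch on a plain vector holding the Component_Selected values from the sample data; the intermediate names (sel, idx_false, idx_kept, idx_next) are introduced here for illustration.

```r
# Component_Selected values from the sample data
sel <- c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
n <- length(sel)

idx_false <- which(!sel)            # indexes that are FALSE: 3 4 5 6
idx_kept  <- setdiff(idx_false, n)  # drop the last row's index: 3 4 5
idx_next  <- idx_kept + 1           # shift to the following rows: 4 5 6

sel[idx_next] <- FALSE              # assign FALSE to those rows
```

Dropping the last row's index before adding one keeps the assignment from writing past the end of the vector.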

R: Updating a data frame based on previous entries
