If a specific value exists in a row, change all remaining values in that row to a new value

Time: 2019-02-04 18:11:59

Tags: r for-loop tidyr

So the data looks like this:

x1       x2   x3   x4   x5       x6          x7         x8        x9
0        0    0    1    2        Complete    Closed     Closed    Closed
0        0    0    0    0        0           0          0         Complete
0        1    2    3    Complete 1           1          1         1

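I want to find a way to replace every remaining value in a row, after the cell where the value "Complete" appears, with "Closed". Once "Complete" has appeared, every later cell in that row should say "Closed". As you can see, a few rows don't follow this logic, and that is the error I need to correct.

The final data should look like this: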

x1       x2   x3   x4   x5       x6          x7         x8        x9
0        0    0    1    2        Complete    Closed     Closed    Closed
0        0    0    0    0        0           0          0         Complete
0        1    2    3    Complete Closed      Closed     Closed    Closed

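I was thinking a for loop could run through each row of the data, check for the value "Complete", and then change only the remaining values in that row to "Closed", but I'm not sure what the syntax would look like.

1 Answer:

Answer 0: (score: 0)

For clarity, I'd write a simple utility function: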

# Replace every value that follows the first "Complete" with "Closed"
close_after_complete = function(x) {
  ifelse(cumsum(x == "Complete") >= 1 & x != "Complete", "Closed", x)
}

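You can read cumsum(x == "Complete") >= 1 as "we have seen at least 1 'Complete'", and x != "Complete" as "but the current value is not 'Complete'". When both conditions are met, you want to change the value to "Closed".

Since we're operating on rows, it makes sense to (implicitly) convert to a matrix and use apply: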
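To make the two conditions concrete, here is a small sketch that evaluates them step by step on the third row from the question. The intermediate names seen and not_complete are only illustrative and do not appear in the answer's function.

```r
# Third row from the question, as a character vector
x <- c("0", "1", "2", "3", "Complete", "1", "1", "1", "1")

# TRUE from the first "Complete" onward (the running count has reached 1)
seen <- cumsum(x == "Complete") >= 1

# TRUE everywhere except at the "Complete" cell itself
not_complete <- x != "Complete"

# Only cells that satisfy both conditions are overwritten
result <- ifelse(seen & not_complete, "Closed", x)
print(result)
```

The "Complete" cell itself is left alone because the second condition is FALSE there, so only the cells after it change.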

t(apply(df, 1, close_after_complete))
#      x1  x2  x3  x4  x5         x6         x7       x8       x9        
# [1,] "0" "0" "0" "1" "2"        "Complete" "Closed" "Closed" "Closed"  
# [2,] "0" "0" "0" "0" "0"        "0"        "0"      "0"      "Complete"
# [3,] "0" "1" "2" "3" "Complete" "Closed"   "Closed" "Closed" "Closed"  

You can convert back to a data frame with as.data.frame() if you want.
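As a minimal sketch of that conversion, the snippet below repeats the function and the sample data so that it runs on its own. The name fixed is just an illustrative choice. Note that apply works on a character matrix here, so every column of the result is character, including the ones that held only numbers.

```r
close_after_complete <- function(x) {
  ifelse(cumsum(x == "Complete") >= 1 & x != "Complete", "Closed", x)
}

# Sample data from the question
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "x1 x2 x3 x4 x5 x6 x7 x8 x9
0 0 0 1 2 Complete Closed Closed Closed
0 0 0 0 0 0 0 0 Complete
0 1 2 3 Complete 1 1 1 1")

# apply() returns one column per input row, so transpose before converting
fixed <- as.data.frame(t(apply(df, 1, close_after_complete)),
                       stringsAsFactors = FALSE)
str(fixed)
```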

# using this data
df = read.table(header = TRUE, stringsAsFactors = FALSE, text = "x1       x2   x3   x4   x5       x6          x7         x8        x9
0        0    0    1    2        Complete    Closed     Closed    Closed
0        0    0    0    0        0           0          0         Complete
0        1    2    3    Complete 1           1          1         1")
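The question asked what a for loop would look like. The following is a rough sketch of that approach, not part of the original answer; the names fixed, row_values, and pos are illustrative. One difference from the function above: the loop overwrites everything after the first "Complete", including any later "Complete" in the same row.

```r
# Sample data from the question
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "x1 x2 x3 x4 x5 x6 x7 x8 x9
0 0 0 1 2 Complete Closed Closed Closed
0 0 0 0 0 0 0 0 Complete
0 1 2 3 Complete 1 1 1 1")

fixed <- df
for (i in seq_len(nrow(fixed))) {
  # Flatten the row to a character vector so it can be searched
  row_values <- unlist(fixed[i, ])
  # Position of the first "Complete" in this row, or NA if there is none
  pos <- match("Complete", row_values)
  # Nothing to do if "Complete" is absent or sits in the last column
  if (!is.na(pos) && pos < ncol(fixed)) {
    fixed[i, (pos + 1):ncol(fixed)] <- "Closed"
  }
}
print(fixed)
```

Unlike the apply version, this keeps the result as a data frame, although any numeric column that receives "Closed" would be coerced to character.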