Question

I have a table with 18 variables that hold comments (updates) on workflow items in the business. The variables are named comment_0 through comment_17.

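Each time a new comment is added, it is inserted into the first empty column of the corresponding row (i.e., if there were 2 previous comments, the next one is added under the comment_2 column).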

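I need to create a new column that copies the latest comment for each row. The expected content of that column is mocked up in the data below under new_column.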

Data:

df1 <- read.table(text = "comment_0   comment_1   comment_2   comment_3   comment_4   comment_5   new_column
NA  NA  NA  NA  NA  NA  NA
           text0   text1   text2   text3   text4   text5   text5
           NA  NA  NA  NA  NA  NA  NA
           text0   NA  NA  NA  NA  NA  text0
           NA  NA  NA  NA  NA  NA  NA
           NA  NA  NA  NA  NA  NA  NA
           text0   NA  NA  NA  NA  NA  text0
           text0   text1   text2   NA  NA  NA  text2
           text0   NA  NA  NA  NA  NA  text0
           text0   NA  NA  NA  NA  NA  text0", header = TRUE, stringsAsFactors = FALSE)

Answer 1

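No loop is needed. We can use max.col with ties.method = "last" to get the column index of the last non-NA entry in each row, build row-column index pairs with cbind, and use them to subset the data frame. Here df is assumed to contain only the comment columns.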

df$new_column <- df[cbind(1:nrow(df), max.col(!is.na(df), ties.method = "last"))]

df
#   comment_0 comment_1 comment_2 comment_3 comment_4 comment_5 new_column
#1       <NA>      <NA>      <NA>      <NA>      <NA>      <NA>       <NA>
#2      text0     text1     text2     text3     text4     text5      text5
#3       <NA>      <NA>      <NA>      <NA>      <NA>      <NA>       <NA>
#4      text0      <NA>      <NA>      <NA>      <NA>      <NA>      text0
#5       <NA>      <NA>      <NA>      <NA>      <NA>      <NA>       <NA>
#6       <NA>      <NA>      <NA>      <NA>      <NA>      <NA>       <NA>
#7      text0      <NA>      <NA>      <NA>      <NA>      <NA>      text0
#8      text0     text1     text2      <NA>      <NA>      <NA>      text2
#9      text0      <NA>      <NA>      <NA>      <NA>      <NA>      text0
#10     text0      <NA>      <NA>      <NA>      <NA>      <NA>      text0
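
The one-liner above packs three steps together. A minimal sketch that spells them out, using a small made-up data frame (the name comments and its values are illustrative assumptions, not the asker's data):

```r
# Small illustrative data frame (hypothetical values)
comments <- data.frame(
  comment_0 = c(NA, "a0", "b0"),
  comment_1 = c(NA, "a1", NA),
  comment_2 = c(NA, "a2", NA),
  stringsAsFactors = FALSE
)

# Step 1: logical matrix, TRUE where a comment exists
has_comment <- !is.na(comments)

# Step 2: per row, the column index of the right-most TRUE.
# In an all-FALSE row every column ties, so the last column is returned;
# that cell is NA, which is the desired result for a row without comments.
last_idx <- max.col(has_comment, ties.method = "last")

# Step 3: build (row, column) index pairs and look them up matrix-style
pairs <- cbind(seq_len(nrow(comments)), last_idx)
latest <- as.matrix(comments)[pairs]
```

This relies on the comment columns being stored in ascending order (comment_0 on the left), as in the question.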

We can also use apply row-wise (with MARGIN = 1) to get the last non-NA value in each row, although this is not recommended when max.col can be used instead.

df$new_column <- apply(df, 1, function(x)  x[which.max(cumsum(!is.na(x)))])
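
Both approaches in this answer scan every column of df, so an extra column (an id, or a new_column that already exists) would be picked up as if it were a comment. A hedged sketch of one way to guard against that, by selecting the comment columns by name first (the id column and the sample values are assumptions for illustration):

```r
# Hypothetical data frame with an extra non-comment column (id)
df <- data.frame(
  id        = 1:3,
  comment_0 = c(NA, "a0", "b0"),
  comment_1 = c(NA, "a1", NA),
  stringsAsFactors = FALSE
)

# Keep only columns named comment_<number>, in their existing order
comment_cols <- grep("^comment_[0-9]+$", names(df), value = TRUE)

# Row-wise: the running count of non-NA values first reaches its maximum
# at the position of the last non-NA value, so which.max() finds it.
# For an all-NA row the count stays 0, which.max() returns 1, and x[1] is NA.
# unname() drops any names that apply() may attach to its result.
df$new_column <- unname(apply(
  df[comment_cols], 1,
  function(x) x[which.max(cumsum(!is.na(x)))]
))
```

The same comment_cols selection can be passed to the max.col version in place of the full data frame.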

Answer 2

Reverse the column order of the data frame, then use dplyr::coalesce to get the first non-NA value in each row:

library(dplyr)

coalesce(!!!df1[, 6:1])
# [1] NA      "text5" NA      "text0" NA      NA      "text0" "text2" "text0" "text0"

# test
identical(df1$new_column, coalesce(!!!df1[, 6:1]))
# [1] TRUE
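
For cases where dplyr is not available, the same reverse-then-take-first-non-NA idea can be sketched in base R. This is only an illustration: first_non_na is a hypothetical helper, not part of any package, and the sample data is made up.

```r
# Small illustrative data frame (hypothetical values)
comments <- data.frame(
  comment_0 = c(NA, "a0", "b0"),
  comment_1 = c(NA, "a1", NA),
  comment_2 = c(NA, "a2", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: element-wise, keep `a` where present, else fall back to `b`
first_non_na <- function(a, b) ifelse(is.na(a), b, a)

# Reverse the column order so the right-most (newest) column is tried first,
# then fold the columns together pairwise from left to right.
latest <- Reduce(first_non_na, rev(as.list(comments)))
```

Because the columns are reversed rather than indexed as 6:1, the sketch does not depend on the number of comment columns, so it should carry over to comment_0 through comment_17 unchanged.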

Nested loop to find the latest comment
