Question

Suppose I have the following data frame.

a <- c('A', 'b', 'c')
b <- c('b', 'c', 'A')
c <- c('c', 'A', 'b')
df <- data.frame(a, b, c)

df
  a b c
1 A b c
2 b c A
3 c A b

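I want to generate additional columns as shown below. Basically, df$b_pos specifies whether 'b' is located before or after 'A' in each row. (The same principle applies to df$c_pos.)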

df$b_pos <- c('after A', 'before A', 'after A')
df$c_pos <- c('after A', 'before A', 'before A')

df
  a b c    b_pos    c_pos
1 A b c  after A  after A
2 b c A before A before A
3 c A b  after A before A

I would like to write lines like the following so that I can automate this process.

df$b_pos <- ifelse(get_the_column_index_of_A > 
                     get_the_column_index_of_b, 'before A', 'after A')
df$c_pos <- ifelse(get_the_column_index_of_A > 
                     get_the_column_index_of_c, 'before A', 'after A')

I would be very grateful if someone could give me a tip on what to use in place of 'get_the_column_index_of_A'.

Answer 1

We can use max.col to do this:

df[c('b_pos', 'c_pos')] <- lapply(letters[2:3], function(x) 
         c("before A", "after A")[1+(max.col(df=="A", "first") < max.col(df==x, "first"))])
df
#  a b c    b_pos    c_pos
#1 A b c  after A  after A
#2 b c A before A before A
#3 c A b  after A before A
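
To see why the max.col approach works, it can help to look at the intermediate values. The following is a minimal sketch that rebuilds the example data and splits the one-liner into steps; the names is_A, pos_A, pos_b and b_pos are introduced here only for illustration.

```r
# Rebuild the example data (same values as in the question).
df <- data.frame(a = c("A", "b", "c"),
                 b = c("b", "c", "A"),
                 c = c("c", "A", "b"))

# Logical matrix: TRUE where a cell equals "A".
is_A <- df == "A"

# For each row, the column index of the first maximum, i.e. the first TRUE.
# Caveat: if a row had no "A" at all, every entry would tie and
# ties.method = "first" would return column 1.
pos_A <- max.col(is_A, ties.method = "first")
pos_b <- max.col(df == "b", ties.method = "first")

pos_A   # 1 3 2
pos_b   # 2 1 3

# FALSE + 1 selects "before A", TRUE + 1 selects "after A".
b_pos <- c("before A", "after A")[1 + (pos_A < pos_b)]
b_pos   # "after A" "before A" "after A"
```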

Another option is to paste the dataset together row by row and use grepl to check for a pattern:

# Paste only the original three columns, so the regex never sees b_pos/c_pos
df[c('b_pos', 'c_pos')] <- lapply(c("A.*b",  "A.*c"), function(x) 
           c("before A", "after A")[grepl(x, do.call(paste0, df[c('a', 'b', 'c')]))+1L])
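
As a rough illustration of the paste-and-grepl idea, the sketch below shows the pasted strings and the regex matches separately. It assumes the cell values are single characters that do not contain one another, since otherwise the pattern could match across cell boundaries; the names rows and b_after_A are chosen here for illustration.

```r
# Rebuild the example data (same values as in the question).
df <- data.frame(a = c("A", "b", "c"),
                 b = c("b", "c", "A"),
                 c = c("c", "A", "b"))

# Collapse each row into one string, column by column.
rows <- do.call(paste0, df[c("a", "b", "c")])
rows        # "Abc" "bcA" "cAb"

# "A.*b" matches when a "b" occurs somewhere after an "A".
b_after_A <- grepl("A.*b", rows)
b_after_A   # TRUE FALSE TRUE

# FALSE + 1 selects "before A", TRUE + 1 selects "after A".
b_pos <- c("before A", "after A")[b_after_A + 1L]
```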

Answer 2

One way, using ifelse and grep:

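# Restrict to the original three columns so the new *_pos columns
# (which also contain the letters 'A' and 'b') are never searched.
df$b_pos <- ifelse(apply(df[,1:3], 1, function(i) grep('A', i)) > 
                     apply(df[,1:3], 1, function(i) grep('b', i)), 'before A', 'after A')

df$c_pos <- ifelse(apply(df[,1:3], 1, function(i) grep('A', i)) > 
                     apply(df[,1:3], 1, function(i) grep('c', i)), 'before A', 'after A')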
df
#  a b c    b_pos    c_pos
#1 A b c  after A  after A
#2 b c A before A before A
#3 c A b  after A before A
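
Note that grep does regular-expression matching, so a pattern such as 'A' would also match a cell like 'AB'. If exact matches are wanted, one possible variation is to use match() per row instead. This is only a sketch; pos_of is a hypothetical helper name, not something from the answer above.

```r
# Rebuild the example data (same values as in the question).
df <- data.frame(a = c("A", "b", "c"),
                 b = c("b", "c", "A"),
                 c = c("c", "A", "b"))

# Hypothetical helper: for each row, the index of the first cell that is
# exactly equal to `value`, or NA if the row does not contain it.
pos_of <- function(value, data) {
  unname(apply(data, 1, function(row) match(value, row)))
}

pos_A <- pos_of("A", df[, 1:3])
pos_b <- pos_of("b", df[, 1:3])

# Same comparison as in the question's pseudocode.
b_pos <- ifelse(pos_A > pos_b, "before A", "after A")
b_pos   # "after A" "before A" "after A"
```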

Answer 3

If you can guarantee that "A" appears exactly once in each row, then you can get a vector of column indices as follows:

> apply(df=="A",1,which)
[1] 1 3 2
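
Building on this, here is a minimal sketch of how the index vector might be plugged into the ifelse from the question. It first checks the "exactly once per row" assumption, because apply(..., which) would return a list rather than a vector if some row had zero or several matches. The names vals and col_index_of are assumptions made for this example.

```r
# Rebuild the example data (same values as in the question).
df <- data.frame(a = c("A", "b", "c"),
                 b = c("b", "c", "A"),
                 c = c("c", "A", "b"))

vals <- df[c("a", "b", "c")]

# Hypothetical helper: column index of `value` in each row.
col_index_of <- function(value) {
  hits <- vals == value
  # Fail early if the "exactly once per row" guarantee does not hold.
  stopifnot(all(rowSums(hits) == 1))
  unname(apply(hits, 1, which))
}

df$b_pos <- ifelse(col_index_of("A") > col_index_of("b"), "before A", "after A")
df$c_pos <- ifelse(col_index_of("A") > col_index_of("c"), "before A", "after A")
df
```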

Get the column number containing a specific character in each row of an R data frame
