Filtering rows based on an ordered character vector

Time: 2017-03-24 14:31:22

Tags: r string vector dataframe filter

I'm not sure whether my question is a duplicate, but searching Stack Overflow did not turn up any possible solution.

I have the following data frame:

num   char  
1     A  
2     K  
3     I  
4     B  
5     I  
6     N  
7     G  
8     O  
9     Z  
10    Q 

I want to select only those rows whose char column spells the word BINGO (in that order), which should produce the following data frame:

num char  
4     B  
5     I  
6     N  
7     G  
8     O 

Any help is greatly appreciated.

5 answers:

Answer 0: (score: 3)

One option is to use zoo::rollapply:

library(zoo)
bingo = c("B", "I", "N", "G", "O")    # the pattern you want to check

# use rollapply to check if the pattern exists in any window
index = which(rollapply(df$char, length(bingo), function(x) all(x == bingo)))

# extract the window from the table
df[mapply(`:`, index, index + length(bingo) - 1),]

#  num char
#4   4    B
#5   5    I
#6   6    N
#7   7    G
#8   8    O
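
For readers without zoo installed, the same sliding-window check can be sketched in base R. This is an illustration rather than part of the answer above; it assumes the char column is stored as character (not factor), and the names `starts`, `hits`, and `rows` are made up for this sketch.

```r
# Example data from the question (assumed: char stored as character)
df <- data.frame(num = 1:10,
                 char = c("A", "K", "I", "B", "I", "N", "G", "O", "Z", "Q"),
                 stringsAsFactors = FALSE)
bingo <- c("B", "I", "N", "G", "O")
n <- length(bingo)

# Every position where a window of length n can start
starts <- seq_len(max(0, nrow(df) - n + 1))

# Keep the starts whose window equals the pattern element by element
hits <- starts[vapply(starts,
                      function(i) all(df$char[i + seq_len(n) - 1L] == bingo),
                      logical(1))]

# Expand each hit into its row numbers; also works for zero or several hits
rows <- unlist(lapply(hits, function(i) i + seq_len(n) - 1L))
df[rows, ]
```

Building the row numbers with `unlist(lapply(...))` keeps the result a plain vector, so the final subset still behaves when the pattern does not occur at all (it returns a data frame with zero rows).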

Answer 1: (score: 1)

Here is a solution using a recursive function. The letters of BINGO do not need to be consecutive, but they do need to appear in order.

df <- data.frame(num=1:10,char=c("A","K","I","B","I","N","G","O","Z","Q"),stringsAsFactors = FALSE)

word<-"BINGO"

chars<-strsplit(word,"")[[1]]

findword <- function(chars,df,a=integer(0),m=0){ #a holds the result so far on recursion, m is the position to start searching
  if(m>=nrow(df)) return(NA) #no rows left to search; avoids the descending range (m+1):nrow(df)
  z <- m+match(chars[1],df$char[(m+1):nrow(df)]) #next match of next letter
  if(!is.na(z)){      
    if(length(chars)==1){
      a <- c(z,a)
    } else {
      a <- c(z,Recall(chars[-1],df,a,max(m,z))) #Recall is function referring to itself recursively
    }
    return(a) #returns row index numbers of df
  } else {
    return(NA)
  }
}

result <- df[findword(chars,df),]
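
The same in-order (not necessarily adjacent) matching can also be written without recursion. The sketch below is an illustration, not the answerer's code: `find_in_order` is a hypothetical helper name, and it assumes the letters are held in a plain character vector. For each letter it takes the first occurrence after the previously matched position and gives up with an empty result if the word cannot be completed.

```r
# Greedy in-order match: returns the positions of the letters, or integer(0)
# if the word cannot be spelled in order. (`find_in_order` is a hypothetical name.)
find_in_order <- function(letters_wanted, x) {
  pos <- integer(0)
  last <- 0L
  for (ch in letters_wanted) {
    candidates <- which(x == ch)
    candidates <- candidates[candidates > last]   # only positions after the last match
    if (length(candidates) == 0) return(integer(0))
    last <- candidates[1]
    pos <- c(pos, last)
  }
  pos
}

# Letters appear in order but are not adjacent
x <- c("B", "A", "I", "K", "N", "G", "Z", "O")
find_in_order(strsplit("BINGO", "")[[1]], x)
# 1 3 5 6 8
```

With the data frame from this answer, `df[find_in_order(chars, df$char), ]` would select rows 4 to 8.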

Answer 2: (score: 0)

d = data.frame(num=1:15, char=c('A', 'K', 'I', 'B', 'I', 'N', 'G', 'O', 'Z', 'Q', 'B', 'I', 'N', 'G', 'O'))
w = "BINGO"
N = nchar(w)
char_str = paste(d$char, sep='', collapse='')

idx = as.integer(gregexpr(w, char_str)[[1]]) 
idx = as.integer(sapply(idx, function(i)seq(i, length=N)))
d[idx, ]

   num char
4    4    B
5    5    I
6    6    N
7    7    G
8    8    O
11  11    B
12  12    I
13  13    N
14  14    G
15  15    O
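
One case the snippet above does not cover is a word that never occurs: `gregexpr` then returns -1, which would be turned into invalid row indices. A minimal sketch of a guarded version follows. `rows_matching_word` is a hypothetical helper name, and the approach assumes every entry of the char column is exactly one character long, so that string positions line up with row numbers.

```r
# Hypothetical helper: rows of d whose char column spells w contiguously
rows_matching_word <- function(d, w) {
  n <- nchar(w)
  char_str <- paste(d$char, collapse = "")
  # fixed = TRUE treats w as a literal string rather than a regular expression
  starts <- as.integer(gregexpr(w, char_str, fixed = TRUE)[[1]])
  if (starts[1] == -1L) return(d[0, ])   # gregexpr signals "no match" with -1
  idx <- unlist(lapply(starts, function(i) i + seq_len(n) - 1L))
  d[idx, ]
}

d <- data.frame(num = 1:15,
                char = c("A", "K", "I", "B", "I", "N", "G", "O", "Z", "Q",
                         "B", "I", "N", "G", "O"),
                stringsAsFactors = FALSE)
rows_matching_word(d, "BINGO")   # rows 4-8 and 11-15
rows_matching_word(d, "BANGO")   # zero-row data frame
```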

Answer 3: (score: 0)

I guess nobody likes loops, but this could be a base R approach:

Application.Transpose
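
For reference, a loop-based version in base R could look like the sketch below. It is an illustration built on the question's example data and is not the answerer's original code; the variable names `keep` and `i` are made up here, and the char column is assumed to be stored as character.

```r
# Example data from the question (assumed: char stored as character)
df <- data.frame(num = 1:10,
                 char = c("A", "K", "I", "B", "I", "N", "G", "O", "Z", "Q"),
                 stringsAsFactors = FALSE)
bingo <- c("B", "I", "N", "G", "O")
n <- length(bingo)

keep <- rep(FALSE, nrow(df))   # marks the rows that belong to a match
i <- 1
while (i + n - 1 <= nrow(df)) {
  window <- i:(i + n - 1)
  if (all(df$char[window] == bingo)) {
    keep[window] <- TRUE
    i <- i + n                 # jump past the matched window
  } else {
    i <- i + 1                 # slide the window by one row
  }
}
df[keep, ]
```

Marking rows in a logical vector instead of collecting indices means the final subset keeps the original row order and returns an empty data frame when nothing matches.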

Answer 4: (score: 0)

I went too fast the first time, but based on the example you gave, I think this works:

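library(dplyr)  # filter() here is dplyr::filter, not stats::filter
# start at the first "B", then keep only rows whose letter is in BINGO
filter(df[which(df$char == "B")[1]:nrow(df),], char %in% c("B","I","N","G","O"))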