R, str_replace, gsub: how to replace one character vector with another?

Asked: 2019-07-11 21:19:09

Tags: r string character

I am trying to "delete" specific characters from several rows of strings.

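I am able to extract the specific characters to "delete" from the column, but I have not been able to recursively replace them with "".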

I tried some options with mapvalues, gsub, and str_replace, but had no luck.

#Example data   
test_col<-data.frame(sequence=c("ATGCRYSW\n",
                                   "ATGCRYSW\\n",
                                   "ATGCRYSW\r\n",
                                   "ATGCRYSW\r\nATGCRYSW",
                                   "ATGCRYSW"),
                                   stringsAsFactors = FALSE)


#vector of allowed characters in strings
permitted_seq_chars<-c("A","C","G","T","R","Y","S","W","K",
                       "M","B","D","H","V","N","+","-","X")



#get all the unique characters in column of interest
all_unique_source_seq_chars<-unique(unlist(strsplit(test_col[["sequence"]],
                                     split ="")))


#subset invalid characters
all_unique_source_seq_invalid_chars<-setdiff(all_unique_source_seq_chars,
                                             permitted_seq_chars )

#'delete' invalid characters one by one. So far this is the only way I've been able to
# do it, but I would like not to depend on fixed values in case new invalid characters
# arise in the future

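library(stringr)  # provides str_replace_all()

# patterns are regular expressions, so a literal backslash is written "\\\\"
str_replace_all(test_col$sequence, c("\n" = "",
                                     "\\\\" = "",
                                     "n" = ""))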

Is there a way to do this recursively just by looking at all_unique_source_seq_invalid_chars?

1 Answer:

Answer 0: (score: 2)

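One option is to paste the individual characters into a single pattern string enclosed in square brackets, so that they are evaluated literally (in case any of them are metacharacters), and then replace every match with an empty string ("") using gsub.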
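The answer's own code did not survive on this page, so the following is only a minimal sketch of the bracketed-pattern idea, not the answerer's exact code. The helper name strip_chars and the sequence_clean column are hypothetical. The backslash-escaping step and perl = TRUE are assumptions added here, because the invalid characters in this example include a literal backslash, which can still act as an escape inside a character class.

```r
# Example data and permitted alphabet, copied from the question
test_col <- data.frame(sequence = c("ATGCRYSW\n",
                                    "ATGCRYSW\\n",
                                    "ATGCRYSW\r\n",
                                    "ATGCRYSW\r\nATGCRYSW",
                                    "ATGCRYSW"),
                       stringsAsFactors = FALSE)

permitted_seq_chars <- c("A", "C", "G", "T", "R", "Y", "S", "W", "K",
                         "M", "B", "D", "H", "V", "N", "+", "-", "X")

# Characters that occur in the column but are not permitted
all_unique_source_seq_chars <- unique(unlist(strsplit(test_col[["sequence"]],
                                                      split = "")))
all_unique_source_seq_invalid_chars <- setdiff(all_unique_source_seq_chars,
                                               permitted_seq_chars)

# Hypothetical helper: remove every character in `chars` from each string in `x`
strip_chars <- function(x, chars) {
  if (length(chars) == 0) return(x)  # nothing to remove
  # Assumption: backslash-escape each non-alphanumeric character so that
  # "\\", "]", "^" or "-" are taken literally inside the character class
  escaped <- gsub("([^[:alnum:]])", "\\\\\\1", chars, perl = TRUE)
  # Collapse everything into one bracketed pattern
  pattern <- paste0("[", paste(escaped, collapse = ""), "]")
  gsub(pattern, "", x, perl = TRUE)
}

test_col$sequence_clean <- strip_chars(test_col$sequence,
                                       all_unique_source_seq_invalid_chars)
```

Because the pattern is built from all_unique_source_seq_invalid_chars, no character has to be listed by hand when new invalid characters appear. If regular expressions are not wanted at all, looping over the characters and calling gsub(ch, "", x, fixed = TRUE) for each one should give the same result.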