Reversing strings in a data frame in R

Time: 2017-06-27 03:55:52

Tags: r dataframe apply sapply

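I have a large data frame in which I want to reverse the strings that differ from the ref column. For example, I would change GA to AG and leave the rest as they are.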

structure(list(number = c("rs1", "rs2", "rs3", "rs4", "rs5", 
"rs6"), ref = c("AG", "AG", "AG", "AG", "AC", "AC"), s1 = c("GA", 
"AG", "GA", "AG", "CA", "AA"), s2 = c("AA", "GG", "GA", "AA", 
"AA", "AC"), s3 = c("GG", "AG", "GG", "AA", "CC", "AC"), s4 = c("GA", 
"GG", "GA", "AA", "AA", "CC"), s5 = c("AA", "GG", "GA", "GG", 
"AA", "CC"), s6 = c("AA", "AG", "GG", "AG", "AA", "CC")), .Names = c("number",
"ref", "s1", "s2", "s3", "s4", "s5", "s6"), class = "data.frame",
row.names = c(NA, -6L))

Input:
number  ref s1  s2  s3  s4  s5  s6 ...
rs1 AG  GA  AA  GG  GA  AA  AA ...
rs2 AG  AG  GG  AG  GG  GG  AG ...
rs3 AG  GA  GA  GG  GA  GA  GG ...
rs4 AG  AG  AA  AA  AA  GG  AG ...
rs5 AC  CA  AA  CC  AA  AA  AA ...
rs6 AC  AA  AC  AC  CC  CC  CC ...

Desired output:
number  ref s1  s2  s3  s4  s5  s6 ...
rs1 AG  AG  AA  GG  AG  AA  AA ...
rs2 AG  AG  GG  AG  GG  GG  AG ...
rs3 AG  AG  AG  GG  AG  AG  GG ...
rs4 AG  AG  AA  AA  AA  GG  AG ...
rs5 AC  AC  AA  CC  AA  AA  AA ...
rs6 AC  AA  AC  AC  CC  CC  CC ...

I tried using the stri_reverse function from library(stringi):

df.1 <- c(df[1:2],sapply(df[3:length(df)], function(x) stri_reverse[[x]]))

Error in stri_reverse[[x]] : object of type 'closure' is not subsettable

1 answer:

Answer 0 (score: 1)

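The error comes from subsetting the function stri_reverse with [[ instead of calling it with ( (probably a typo?). Besides that, you also need to tweak your logic slightly to get what you need: reverse each value, then keep the reversed value only where it equals ref: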

library(stringi)

df[-c(1,2)] <- lapply(df[-c(1,2)], function(col) {
    # reverse every string in the column
    rev_col <- stri_reverse(col)
    # keep the reversed string only where it matches ref
    ifelse(rev_col == df$ref, rev_col, col)
})

df
#  number ref s1 s2 s3 s4 s5 s6
#1    rs1  AG AG AA GG AG AA AA
#2    rs2  AG AG GG AG GG GG AG
#3    rs3  AG AG AG GG AG AG GG
#4    rs4  AG AG AA AA AA GG AG
#5    rs5  AC AC AA CC AA AA AA
#6    rs6  AC AA AC AC CC CC CC
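
If installing stringi is not an option, the same idea can probably be sketched in base R. This is only an illustration under a few assumptions: reverse_str is a hypothetical helper name (not part of any package), the small data frame below is a cut-down copy of the question's data, and NA values are not handled (paste would turn them into the string "NA").

```r
# Hypothetical helper: reverse each string in a character vector (base R only).
# Note: NA inputs are not handled here; they would come back as the string "NA".
reverse_str <- function(x) {
  vapply(strsplit(x, "", fixed = TRUE),
         function(chars) paste(rev(chars), collapse = ""),
         character(1))
}

# Cut-down copy of the question's data, for illustration only
df <- data.frame(
  number = c("rs1", "rs2", "rs5", "rs6"),
  ref    = c("AG", "AG", "AC", "AC"),
  s1     = c("GA", "AG", "CA", "AA"),
  s2     = c("AA", "GG", "AA", "AC"),
  stringsAsFactors = FALSE
)

# Every column except number and ref is treated as a sample column
sample_cols <- setdiff(names(df), c("number", "ref"))

df[sample_cols] <- lapply(df[sample_cols], function(col) {
  rev_col <- reverse_str(col)
  # keep the reversed string only where it matches ref
  ifelse(rev_col == df$ref, rev_col, col)
})
```

Selecting the sample columns by name rather than by position is a design choice here; it should keep working if number and ref are not the first two columns.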