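Replace multiple strings in a data.frame with values from another data.frame

Time: 2018-02-02 20:27:41

Tags: r string dataframe replace

I am trying to replace occurrences of placeholder strings in one data.frame of strings with values taken from another data.frame of strings.

Multiple base strings in which the substrings should be replaced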

# base strings which I want to replace
# (uses nrow(repl1), so run this after repl1 is defined below)
base  <- data.frame(cmd = rep("this is my example <repl1> and here second <repl2> ...", nrow(repl1)))

Replacement strings

# definition of replacement strings
repl1 <- data.frame(as.character(1:10))
repl2 <- data.frame(as.character(10:1))

I tried iterating over the data.frame with lapply ...

# what I have tried
lapply(base, function(x) {gsub("<repl1>", repl1, x)})

The result I get looks like this ...

 [1] "this is my example c(1, 3, 4, 5, 6, 7, 8, 9, 10, 2) and here second <repl2> ..."
 [2] "this is my example c(1, 3, 4, 5, 6, 7, 8, 9, 10, 2) and here second <repl2> ..."
 [3] "this is my example c(1, 3, 4, 5, 6, 7, 8, 9, 10, 2) and here second <repl2> ..."

But I would like to get ...

 [1] "this is my example 1 and here second 10 ..."
 [2] "this is my example 2 and here second 9 ..."
 [3] "this is my example 3 and here second 8 ..."

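Thx for every suggestion :)

2 Answers:

Answer 0: (score: 2)

We can use the vectorized regmatches function here. This removes all loops:

First, since your replacements are in different data frames, combine them: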

repl3 <- cbind(A=repl1,B=repl2)

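There is one more problem. Because of the way you created the data frames, the strings are stored as class factor (the default before R 4.0.0). So I will convert them: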

s <- as.character(base$cmd)

From here we do the replacement directly:

regmatches(s, gregexpr("<repl1>|<repl2>", s)) <- strsplit(do.call(paste, repl3), " ")
s
 [1] "this is my example 1 and here second 10 ..."
 [2] "this is my example 2 and here second 9 ..." 
 [3] "this is my example 3 and here second 8 ..." 
 [4] "this is my example 4 and here second 7 ..." 
 [5] "this is my example 5 and here second 6 ..." 
 [6] "this is my example 6 and here second 5 ..." 
 [7] "this is my example 7 and here second 4 ..." 
 [8] "this is my example 8 and here second 3 ..." 
 [9] "this is my example 9 and here second 2 ..." 
[10] "this is my example 10 and here second 1 ..."
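The replacement form of regmatches assigns values by match position, which is easier to see on a smaller input. The following is a minimal sketch with two made-up template strings (not the asker's data): gregexpr records where each placeholder occurs, and the list on the right-hand side supplies one character vector of replacements per string, in match order.

```r
# Two illustrative template strings, each containing both placeholders
s <- c("a <repl1> b <repl2>", "c <repl1> d <repl2>")

# Match positions of every placeholder, one list element per string
m <- gregexpr("<repl1>|<repl2>", s)

# One character vector per string; values are assigned in match order
regmatches(s, m) <- list(c("1", "10"), c("2", "9"))

s
# [1] "a 1 b 10" "c 2 d 9"
```

Because the assignment is purely positional, this relies on <repl1> appearing before <repl2> in every string.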

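Your data needs this much code because you forgot to use the stringsAsFactors = FALSE option every time you created a data frame. Had you used it, the code would be simple:

v <- as.character(base$cmd)
repl4 <- data.frame(1:10, 10:1, stringsAsFactors = FALSE)
regmatches(v, gregexpr("<repl1>|<repl2>", v)) <- data.frame(t(repl4))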
v
 [1] "this is my example 1 and here second 10 ..."
 [2] "this is my example 2 and here second 9 ..." 
 [3] "this is my example 3 and here second 8 ..." 
 [4] "this is my example 4 and here second 7 ..." 
 [5] "this is my example 5 and here second 6 ..." 
 [6] "this is my example 6 and here second 5 ..." 
 [7] "this is my example 7 and here second 4 ..." 
 [8] "this is my example 8 and here second 3 ..." 
 [9] "this is my example 9 and here second 2 ..." 
[10] "this is my example 10 and here second 1 ..."
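As a small illustration of the stringsAsFactors point, the sketch below builds the same column with the option set each way and checks the resulting class. The column name x is made up for the example. Note that since R 4.0.0 the default is stringsAsFactors = FALSE, so the factor conversion only happens on older versions or when requested explicitly.

```r
# Same data, with the option set explicitly each way
as_factor <- data.frame(x = as.character(1:3), stringsAsFactors = TRUE)
as_string <- data.frame(x = as.character(1:3), stringsAsFactors = FALSE)

class(as_factor$x)  # "factor"
class(as_string$x)  # "character"
```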

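Answer 1: (score: 1)

You need to index both the base data frame and the repl1 data frame. Your code passes the entire repl1 data frame as the replacement for every row of the base data frame.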
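To see why a whole data frame turns into text such as "c(1, 3, ...)", it helps to look at what gsub receives. This is a small sketch, assuming (as the output in the question suggests) that gsub coerces a non-character replacement with as.character: for a data frame, that coercion deparses each column into one string.

```r
repl1 <- data.frame(as.character(1:10))

# as.character() on a data frame yields one deparsed string per column,
# not one string per row
coerced <- as.character(repl1)

length(coerced)  # 1
coerced          # a single string starting with "c("
```

The exact text of that string depends on whether the column is a factor or a character vector, but in both cases it is one string that gets inserted into every row.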

Try this:

# definition of replacement strings
repl1 <- data.frame(as.character(1:10))
repl2 <- data.frame(as.character(10:1))

# base strings which I want to replace
base  <- data.frame(cmd = rep("this is my example <repl1> and here second <repl2> ...", nrow(repl1)))

answer<-sapply(1:nrow(repl1), function(x) {gsub("<repl1>", repl1[x,1],  base[x,1])})

Now repeat this with answer and the repl2 data frame.
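A minimal sketch of that second pass, under the same setup, might look like the following. The as.character calls are an addition here so the code behaves the same whether the columns are factors or character vectors.

```r
repl1 <- data.frame(as.character(1:10))
repl2 <- data.frame(as.character(10:1))
base  <- data.frame(cmd = rep("this is my example <repl1> and here second <repl2> ...", nrow(repl1)))

# First pass: fill in <repl1> row by row
answer <- sapply(seq_len(nrow(repl1)), function(i) {
  gsub("<repl1>", as.character(repl1[i, 1]), as.character(base[i, 1]))
})

# Second pass: fill in <repl2>, starting from the result of the first pass
answer <- sapply(seq_len(nrow(repl2)), function(i) {
  gsub("<repl2>", as.character(repl2[i, 1]), answer[i])
})

answer[1]
# [1] "this is my example 1 and here second 10 ..."
```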

Addition: another approach is to use the str_replace function from the stringr library:

library(stringr)
answer<-str_replace(base[,1], "<repl1>", as.character(repl1[,1]))

This is most likely faster than the sapply approach.
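Both placeholders can be handled the same way by chaining two calls. This is a sketch that assumes the stringr package is installed; the explicit stringsAsFactors = FALSE and the intermediate name step1 are additions for the example. str_replace is vectorized over the string and the replacement, so each row gets its own value.

```r
library(stringr)

repl1 <- data.frame(as.character(1:10), stringsAsFactors = FALSE)
repl2 <- data.frame(as.character(10:1), stringsAsFactors = FALSE)
base  <- data.frame(cmd = rep("this is my example <repl1> and here second <repl2> ...", nrow(repl1)),
                    stringsAsFactors = FALSE)

# Replace <repl1> first, then <repl2>, element by element
step1  <- str_replace(base[, 1], "<repl1>", repl1[, 1])
answer <- str_replace(step1, "<repl2>", repl2[, 1])

answer[3]
# [1] "this is my example 3 and here second 8 ..."
```

str_replace only replaces the first match in each string; if a placeholder can occur more than once per string, str_replace_all is the variant to reach for.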