How to compare two strings in R

Time: 2019-04-19 14:02:13

Tags: r string

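I have a dataset with two string variables. Both contain sentences that I want to compare word by word. I want to create a new column ("new_var") that should look like this: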

var1                   var2                 new_var
"sentence numer one"   "setence numer two"  sentence:setence + one:two
"another one is here"  "aner one are hre"   another:aner + is:are + here:hre

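I don't know how to write code that works on the whole dataset, that is, code that adds a new column based on a condition and a loop. My code only works when I define the objects var1 and var2 like this.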

library(stringr)

var1 = "this is sentence numer one"
var2 = "this is setence numer two"


new_var <- for (i in 1:(lengths(gregexpr("\\s+", var1)) + 1)) {
  if (word(string = var1, start = i, end = i) != word(string=var2, start=i, end=i)) 
  {
    cat(word(string = var1, start = i, end = i), word(string = var2, start = i, end = i), "+", sep=":")
  } else {
    cat("")
  } 
}

1 answer:

Answer 0 (score: 1)

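One possibility is to split the sentences into words with str_split (from the stringr package) first, and then compare them pairwise with map2 from the purrr package.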

First, I create some dummy data:

x <- c("sentence number one", "another one is here")
y <- c("setence number two", "aner one are hre")

Then I split each sentence into words and compare the words position by position:

library(stringr)  # provides str_split

# Split each sentence into a character vector of words
x2 <- str_split(x, " ")
y2 <- str_split(y, " ")

library(purrr)
map2(x2, y2, ~ifelse(.x == .y, "", paste(.x, .y, sep = ":")))

[[1]]
[1] "sentence:setence" ""                 "one:two"         

[[2]]
[1] "another:aner" ""             "is:are"       "here:hre"
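
The result above is a list that still contains empty strings, while the question asks for a single new_var column. A minimal sketch of that last step follows. It swaps in base R's strsplit and mapply for str_split and map2 so that it runs without extra packages. The data frame df and the helper diff_words are hypothetical names used only for illustration, and the sketch assumes that both sentences in a row have the same number of words.

```r
# Hypothetical example data, mirroring the question's var1 and var2 columns
df <- data.frame(
  var1 = c("sentence numer one", "another one is here"),
  var2 = c("setence numer two", "aner one are hre"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the word pairs that differ and
# join them as "word1:word2 + word1:word2".
# Assumes a and b are word vectors of equal length.
diff_words <- function(a, b) {
  differs <- a != b
  paste(paste(a[differs], b[differs], sep = ":"), collapse = " + ")
}

# Split every sentence on single spaces, then compare row by row
df$new_var <- mapply(
  diff_words,
  strsplit(df$var1, " "),
  strsplit(df$var2, " "),
  USE.NAMES = FALSE
)

print(df$new_var)
```

For rows where the two sentences are identical, diff_words returns an empty string. If the word counts can differ between var1 and var2, the comparison a != b recycles the shorter vector, so a length check would be needed first.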