R: replace part of a string using a lookup

Time: 2018-07-10 15:29:31

Tags: r regex vector replace

I have two data frames:

DF1: the mapping (two columns)

code | value
SDR111X | 10
DER333F | 15

DF2: the data (one column; I add two more columns in the script below)

string
AA.SDR111X AS SDR111X
AB.SDR111X AS SDR111X
DD.YRE999C AS YRE999C

The goal is to iterate over DF1 and, for each row, look in DF2 and replace the SECOND occurrence of CODE with VALUE. This is the result I expect:

string
AA.SDR111X AS 10
AB.SDR111X AS 10
DD.YRE999C AS YRE999C

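For now, I have started by trying to work out the replacement part of the requirement.

After that, I will deal with the iteration part of the code!

I tried the following without success. The code runs without errors, but no values are changed: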

library(tidyverse)

data <- data
data <- data %>%
  mutate(lhs = substr(X__1, 1, 14)) %>%
  mutate(rhs = substr(X__1, 15, 200))

pattern <- "SDR111X"
replacement <- "10"

str_replace_all(data$rhs, pattern, replacement)

The same thing happens here:

library(tidyverse)

data <- data
data <- data %>%
  mutate(lhs = substr(X__1, 1, 14)) %>%
  mutate(rhs = substr(X__1, 15, 200))

data <- data %>%
  mutate(rhs1 = replace(rhs, rhs=="SDR111X", 10))

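Thanks for your help

2 Answers:

Answer 0: (score: 2)

One solution is to use the fuzzyjoin package to join the two data frames by regular expression, and then perform the replacement on the joined result.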

library(dplyr)
library(fuzzyjoin)

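# Join each string to the mapping row whose code it contains (regex match),
# then replace only the occurrence of the code at the end of the string.
DF2 %>% regex_left_join(DF1, by = c("string" = "code")) %>%
  rowwise() %>%   # gsub() takes a single pattern, so work one row at a time
  mutate(string = gsub(paste0(code, "$"), value, string)) %>%
  select(string)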

# # A tibble: 3 x 1
#        string               
#        <chr>                
# 1 AA.SDR111X AS 10     
# 2 AB.SDR111X AS 10     
# 3 DD.YRE999C AS YRE999C
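
The `$` appended to the code is what makes the replacement hit the second occurrence rather than the first: it anchors the pattern to the end of the string. A minimal base R sketch of the difference, using one of the sample strings (the variable names `x`, `first_hit` and `last_hit` are only for illustration):

```r
x <- "AA.SDR111X AS SDR111X"

# Without an anchor, sub() replaces the first occurrence of the code.
first_hit <- sub("SDR111X", "10", x)

# With "$", the pattern can only match at the very end of the string,
# so the leading "AA.SDR111X" is left alone.
last_hit <- sub("SDR111X$", "10", x)

print(first_hit)  # "AA.10 AS SDR111X"
print(last_hit)   # "AA.SDR111X AS 10"
```

This relies on the code to be replaced always being the last thing in the string, which holds for the sample data but is an assumption about the real data.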

Data:

DF1 <- read.table(text = 
"code  value
SDR111X  10
DER333F  15",
header = TRUE, stringsAsFactors = FALSE)


DF2 <- read.table(text = 
"string
'AA.SDR111X AS SDR111X'
'AB.SDR111X AS SDR111X'
'DD.YRE999C AS YRE999C'",
header = TRUE, stringsAsFactors = FALSE)
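
For comparison, the split-and-replace idea from the question can also be made to work, provided the result of `str_replace_all()` is assigned back and the two halves are pasted together again. The sketch below is only an illustration under two assumptions: the column is called `string` as in DF2 above (not `X__1`), and the first 14 characters always hold the part that must stay untouched, which is true for the sample rows but may not be for real data.

```r
library(stringr)

DF2 <- data.frame(
  string = c("AA.SDR111X AS SDR111X",
             "AB.SDR111X AS SDR111X",
             "DD.YRE999C AS YRE999C"),
  stringsAsFactors = FALSE
)

# Split at a fixed position (assumption: the prefix is always 14 characters).
lhs <- substr(DF2$string, 1, 14)
rhs <- substr(DF2$string, 15, nchar(DF2$string))

# str_replace_all() returns a new vector; it must be assigned to take effect.
rhs <- str_replace_all(rhs, "SDR111X", "10")

# Recombine the untouched prefix with the modified suffix.
fixed <- paste0(lhs, rhs)
print(fixed)
```

The fixed split position makes this more fragile than the anchored-pattern approach, so it is probably best treated as a way to understand the original attempt rather than as a replacement for the join.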

Answer 1: (score: 1)

Here is a general solution using the tidyverse.
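
A minimal sketch of one way to do this with tidyverse functions, not necessarily the approach this answer originally used. It builds a named lookup vector from DF1 (names are end-anchored patterns, values are replacements) and passes it to `str_replace_all()`, which accepts a named vector for multiple replacements. The names `lookup` and `result` are made up for the example, and the codes are assumed to contain no regex metacharacters.

```r
library(dplyr)
library(stringr)

DF1 <- data.frame(code = c("SDR111X", "DER333F"),
                  value = c(10, 15),
                  stringsAsFactors = FALSE)

DF2 <- data.frame(
  string = c("AA.SDR111X AS SDR111X",
             "AB.SDR111X AS SDR111X",
             "DD.YRE999C AS YRE999C"),
  stringsAsFactors = FALSE
)

# Names are regex patterns anchored to the end of the string;
# values are the replacement text (must be character).
lookup <- setNames(as.character(DF1$value), paste0(DF1$code, "$"))

# Apply every pattern/replacement pair to every string in one call.
result <- DF2 %>%
  mutate(string = str_replace_all(string, lookup))

print(result$string)
```

This handles the iteration over DF1 implicitly, since every row of the mapping becomes one entry of `lookup`. Strings that match none of the codes are returned unchanged.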
