How do I find specific keywords in a string and replace the string with the matched keyword?

Asked: 2018-03-28 16:33:36

Tags: r string

I have a data frame like this:

ID <- c("A112","A114","A134","A116","A117","A138")
Comment <- c("Beam calibration again", "Beam calibration. Tools ready",
             "Did not find anything wrong. Beam calibration","Performed beam calibration. tool ready",
             "STD Qual and Blurry image looks fine","STD Qual failed and Slightly Blurry image")
df <- data.frame(ID, Comment)
df

    ID                                      Comment
  A112                       Beam calibration again
  A114               Beam calibration. Tools ready
  A134 Did not find anything wrong. Beam calibration
  A116       Performed beam calibration. tool ready
  A117         STD Qual and Blurry image looks fine
  A138    STD Qual failed and Slightly Blurry image

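Because the comments are too long, I want to reduce each one to a specific keyword, such as "Blurry image" or "Beam calibration". This is the output I want: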

    ID          Comment
  A112 Beam calibration
  A114 Beam calibration
  A134 Beam calibration
  A116 beam calibration
  A117     Blurry image
  A138     blurry image

I tried this for a single comment, but how can I apply the same logic programmatically to the whole column?

df$Comment <- gsub("Beam calibration again", "Beam calibration", df$Comment)

1 answer:

Answer 0: (score: 2)

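Based on your sample data, if you are only looking for "Beam calibration" or "Blurry image", you can capture the keyword and replace the whole string with the captured group: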

df$Comment <- gsub(".*(Beam calibration|Blurry image).*", "\\1", df$Comment, ignore.case = TRUE)

df
#    ID          Comment
#1 A112 Beam calibration
#2 A114 Beam calibration
#3 A134 Beam calibration
#4 A116 beam calibration
#5 A117     Blurry image
#6 A138     Blurry image
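
One thing to be aware of: gsub() returns its input unchanged when the pattern does not match, so a comment containing neither keyword keeps its full length. Below is a minimal sketch of an alternative using base R's regexpr() and regmatches(), assuming you would rather get NA for such rows. The helper name extract_keyword and the sample vector comments are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper: return the first keyword found in each string, or NA.
extract_keyword <- function(x, pattern) {
  x <- as.character(x)                          # also handles factor columns
  m <- regexpr(pattern, x, ignore.case = TRUE)  # -1 where nothing matched
  out <- rep(NA_character_, length(x))          # default: no keyword found
  hit <- which(m != -1)                         # rows with a match
  out[hit] <- regmatches(x, m)                  # matched substrings only
  out
}

# Illustrative data; the third comment contains no keyword.
comments <- c("Performed beam calibration. tool ready",
              "STD Qual and Blurry image looks fine",
              "Nothing to report")
result <- extract_keyword(comments, "Beam calibration|Blurry image")
print(result)  # values: "beam calibration", "Blurry image", NA
```

Unlike the greedy ".*( ... ).*" pattern, which keeps the last keyword in a string when several occur, regexpr() reports the first match.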

Or, to avoid typing the pattern by hand, if you have a vector of keywords you can build the lookup pattern from it:

keywords <- c("Beam calibration", "Blurry image")
lookup <- paste0(".*(", paste(keywords, collapse = "|"), ").*")
df$Comment <- gsub(lookup, "\\1", df$Comment, ignore.case = TRUE)
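
If a keyword contains regex metacharacters such as "." or "(", pasting it straight into the pattern changes what it matches. Here is a minimal sketch of one way around that, assuming the keywords are meant literally: wrap each one in \Q...\E and pass perl = TRUE so the quoting is interpreted by PCRE. The keywords and comments below are invented for illustration, and the approach would break on a keyword that itself contains "\E".

```r
# Illustrative keywords containing regex metacharacters ("." and "(").
keywords <- c("Std. Qual", "Tool (ready)")

# Quote each keyword so it is matched literally (PCRE \Q...\E syntax).
quoted <- paste0("\\Q", keywords, "\\E")
lookup <- paste0(".*(", paste(quoted, collapse = "|"), ").*")

comments <- c("Ran Std. Qual twice",
              "Tool (ready) after rework",
              "Std1 Qual run")          # must NOT match "Std. Qual"
result <- gsub(lookup, "\\1", comments, ignore.case = TRUE, perl = TRUE)
print(result)  # values: "Std. Qual", "Tool (ready)", "Std1 Qual run"
```

Without the quoting, the "." in "Std. Qual" would match any character, so the third comment would be reduced to "Std1 Qual".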