I have a data frame like this:
ID <- c("A112","A114","A134","A116","A117","A138")
Comment <- c("Beam calibration again", "Beam calibration. Tools ready",
"Did not find anything wrong. Beam calibration","Performed beam calibration. tool ready",
"STD Qual and Blurry image looks fine","STD Qual failed and Slightly Blurry image")
df<- data.frame(ID,Comment)
df
ID Comment
A112 Beam calibration again
A114 Beam calibration. Tools ready
A134 Did not find anything wrong. Beam calibration
A116 Performed beam calibration. tool ready
A117 STD Qual and Blurry image looks fine
A138 STD Qual failed and Slightly Blurry image
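Because the comments are too long, I want to shorten them to specific keywords such as "Blurry image" and "Beam calibration". The desired output is: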
ID Comment
A112 Beam calibration
A114 Beam calibration
A134 Beam calibration
A116 beam calibration
A117 Blurry image
A138 blurry image
I tried it this way for one column, but how can I apply similar logic to all columns programmatically?
df$Comment <- gsub("Beam calibration again", "Beam calibration", df$Comment)
Answer 0 (score: 2)
Based on your sample data, if you are only looking for Beam calibration and Blurry image, you could do this:
df$Comment <- gsub(".*(Beam calibration|Blurry image).*", "\\1", df$Comment, ignore.case = TRUE)
df
# ID Comment
#1 A112 Beam calibration
#2 A114 Beam calibration
#3 A134 Beam calibration
#4 A116 beam calibration
#5 A117 Blurry image
#6 A138 Blurry image
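One thing to keep in mind with this pattern: a comment that contains neither keyword does not match at all, so gsub returns it unchanged. The following is a minimal sketch, assuming you would rather see NA in that case. The second comment string and the names comments, pattern, and shortened are made up for illustration.

```r
comments <- c("Performed beam calibration. tool ready",
              "Replaced the filter")  # hypothetical comment with no keyword
pattern <- ".*(Beam calibration|Blurry image).*"

# gsub() leaves strings that do not match the pattern untouched
shortened <- gsub(pattern, "\\1", comments, ignore.case = TRUE)

# Use grepl() to find the rows where nothing matched and set them to NA
shortened[!grepl(pattern, comments, ignore.case = TRUE)] <- NA
shortened
```

Also note that the leading .* is greedy, so if a comment contains more than one keyword, the last one in the string is the one that is kept.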
Or, to avoid typing the pattern by hand, if you have a vector of keywords you can build the lookup pattern from it:
keywords <- c("Beam calibration", "Blurry image")
lookup <- paste0(".*(", paste(keywords, collapse = "|"), ").*")
df$Comment <- gsub(lookup, "\\1", df$Comment, ignore.case = TRUE)
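Since the question asks about applying this to several columns, here is one possible sketch using lapply over a set of column names. The data frame df2 and its Note column are hypothetical; the sample data in the question only has a single Comment column.

```r
df2 <- data.frame(
  ID = c("A112", "A117"),
  Comment = c("Beam calibration again", "STD Qual and Blurry image looks fine"),
  Note = c("Slightly blurry image", "Performed beam calibration"),  # hypothetical second column
  stringsAsFactors = FALSE
)

keywords <- c("Beam calibration", "Blurry image")
lookup <- paste0(".*(", paste(keywords, collapse = "|"), ").*")

# Columns to shorten: every column except ID in this sketch
cols <- setdiff(names(df2), "ID")

# Apply the same replacement to each selected column
df2[cols] <- lapply(df2[cols], function(x) gsub(lookup, "\\1", x, ignore.case = TRUE))
df2
```

Because ignore.case = TRUE only affects matching, each shortened value keeps the capitalization it had in the original text.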