Using gsub to replace everything except certain strings

Date: 2018-05-09 09:04:55

Tags: r regex gsub

I am trying to replace the values in a column based on a pattern, using the gsub function.

The data:

library(tidyverse)

df <- structure(list(partij_kort = c("COMBGB", "VVD", "GL", "NIEUWEL", 
"CDA")), .Names = "partij_kort", row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

  partij_kort
  <chr>      
1 COMBGB     
2 VVD        
3 GL         
4 NIEUWEL    
5 CDA 

This code does exactly the opposite of what I want:

df %>% mutate(new = gsub("VVD|GL|CDA|CU|D66|PVDA|CUSGP|SGP|PVDAGL",
                         "something",
                         partij_kort))

  partij_kort new      
  <chr>       <chr>    
1 COMBGB      COMBGB   
2 VVD         something
3 GL          something
4 NIEUWEL     NIEUWEL  
5 CDA         something

I want every string that is not in the pattern (here COMBGB and NIEUWEL) to be changed to something.

But the exclamation mark ! does not work with gsub (I use it a lot with grepl).

Desired result:

  partij_kort new      
  <chr>       <chr>    
1 COMBGB      something
2 VVD         VVD      
3 GL          GL       
4 NIEUWEL     something
5 CDA         CDA 

What is the best way to do this?

3 answers:

Answer 0: (score: 1)

Actually, there is no need for a regular expression here, imo:

library(dplyr)

exceptions <- c("VVD","GL","CDA","CU","D66","PVDA","CUSGP","SGP","PVDAGL")

df %>%
  mutate(new = if_else(!(partij_kort %in% exceptions), 
                       "something", 
                       partij_kort))

This yields

# A tibble: 5 x 2
  partij_kort new      
  <chr>       <chr>    
1 COMBGB      something
2 VVD         VVD      
3 GL          GL       
4 NIEUWEL     something
5 CDA         CDA      
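For readers without dplyr, the same idea can be sketched in base R with ifelse and %in%. This is only an illustration, assuming the column is a plain character vector; the data frame is rebuilt with data.frame() so the block runs on its own.

```r
# Same data as in the question, built as a plain data frame
df <- data.frame(partij_kort = c("COMBGB", "VVD", "GL", "NIEUWEL", "CDA"),
                 stringsAsFactors = FALSE)

exceptions <- c("VVD", "GL", "CDA", "CU", "D66", "PVDA", "CUSGP", "SGP", "PVDAGL")

# Keep the value when it is one of the exceptions, otherwise use "something"
df$new <- ifelse(df$partij_kort %in% exceptions, df$partij_kort, "something")
df
```

Because %in% compares whole strings, a value that merely contains one of the exceptions as a substring is still replaced.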

Answer 1: (score: 1)

You need to use perl = TRUE in gsub and negate your alternatives in the regular expression with a negative lookahead.

library(tidyverse)

df <- structure(list(partij_kort = c("COMBGB", "VVD", "GL", "NIEUWEL", "CDA", "anything", "good" ,"bad","whtever")), 
                .Names = "partij_kort", 
                row.names = c(NA, -9L), 
                class = c("tbl_df", "tbl", "data.frame"))

df %>% mutate(new = gsub("^((?!(VVD|GL|CDA|CU|D66|PVDA|CUSGP|SGP|PVDAGL)).)*$",
                         "something", partij_kort, perl = TRUE))


# A tibble: 9 x 2
  partij_kort new      
  <chr>       <chr>    
1 COMBGB      something
2 VVD         VVD      
3 GL          GL       
4 NIEUWEL     something
5 CDA         CDA      
6 anything    something
7 good        something
8 bad         something
9 whtever     something
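One thing to be aware of: the lookahead pattern above rejects any string that contains one of the alternatives anywhere, not only strings that are exactly equal to one. If whole-string matching is what you want, a possible variant is to put the end anchor inside the lookahead. The sketch below compares the two; "GLOBAL" is a made-up value added only to show the difference, and sub is used instead of gsub since the pattern covers the whole string.

```r
# "GLOBAL" is a hypothetical value that contains "GL" but is not an exception
x <- c("COMBGB", "VVD", "GL", "NIEUWEL", "CDA", "GLOBAL")

exceptions <- "VVD|GL|CDA|CU|D66|PVDA|CUSGP|SGP|PVDAGL"

# Pattern from the answer: no exception may occur anywhere in the string
unanchored <- paste0("^((?!(", exceptions, ")).)*$")
new_unanchored <- gsub(unanchored, "something", x, perl = TRUE)

# Variant: only reject strings that are exactly one of the exceptions
anchored <- paste0("^(?!(?:", exceptions, ")$).*$")
new_anchored <- sub(anchored, "something", x, perl = TRUE)

data.frame(x, new_unanchored, new_anchored)
```

With the answer's pattern "GLOBAL" is left untouched, while the anchored variant turns it into "something". Which behaviour is right depends on the data.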

Thanks

Answer 2: (score: 0)

You can also use replace together with grepl, like this:

library(tidyverse)
df %>% mutate(new = replace(partij_kort,
                            !grepl("VVD|GL|CDA|CU|D66|PVDA|CUSGP|SGP|PVDAGL", partij_kort),
                            "something"))


# A tibble: 5 x 2
#  partij_kort       new
#        <chr>     <chr>
#1      COMBGB something
#2         VVD       VVD
#3          GL        GL
#4     NIEUWEL something
#5         CDA       CDA
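As to why ! works with grepl but not with gsub, here is a small base R sketch on the question's values: grepl returns a logical vector, which ! can invert, whereas gsub returns a character vector, and negating a character vector is an error. The variable names are illustrative only.

```r
x <- c("COMBGB", "VVD", "GL", "NIEUWEL", "CDA")
pattern <- "VVD|GL|CDA|CU|D66|PVDA|CUSGP|SGP|PVDAGL"

# grepl gives one TRUE/FALSE per element, so it can be negated with `!`
keep <- grepl(pattern, x)
is.logical(keep)

# replace() overwrites the elements where the negated vector is TRUE
new <- replace(x, !keep, "something")
new

# gsub gives back strings, so applying `!` to its result fails
res <- try(!gsub(pattern, "something", x), silent = TRUE)
inherits(res, "try-error")
```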