I have two related questions about regular expressions in R:
[1]
I want to replace each substring made of punctuation followed by a letter with that letter in uppercase.
Examples:
Dr_dre to: DrDre
Captain.Spock to: CaptainSpock
spider-man to: spiderMan
[2]
I want to convert camel-case strings to lowercase strings with underscore separators.
For example:
EndOfFile to: End_of_file
CamelCase to: Camel_Case
ABC to: A_B_C
Many thanks,
Kamashay
Answer 0: (score: 2)
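We can use sub. We match one or more punctuation characters ([[:punct:]]+) followed by a single character captured as a group ((.)). In the replacement, the backreference to that capture group (\\1) is converted to uppercase (\\U), which requires perl = TRUE. Because the punctuation is outside the group, it is dropped from the result.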
sub("[[:punct:]]+(.)", "\\U\\1", str1, perl = TRUE)
#[1] "DrDre" "CaptainSpock" "spiderMan"
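Note that sub replaces only the first match in each string, which is enough for the examples above. A minimal sketch for strings that contain several separators, assuming gsub with the same pattern is acceptable (the helper name punct_to_camel is hypothetical, not part of the original answer):

```r
# Hypothetical helper: drop every run of punctuation and
# upper-case the single character that follows it.
punct_to_camel <- function(x) {
  # gsub (unlike sub) replaces all matches; \\U needs perl = TRUE
  gsub("[[:punct:]]+(.)", "\\U\\1", x, perl = TRUE)
}

str1 <- c("Dr_dre", "Captain.Spock", "spider-man")
res1 <- punct_to_camel(str1)
print(res1)
#[1] "DrDre"        "CaptainSpock" "spiderMan"

# A string with more than one separator
res_multi <- punct_to_camel("one_two.three-four")
print(res_multi)
#[1] "oneTwoThreeFour"
```

Trailing punctuation with no following character is left untouched, because the pattern requires a character after the punctuation run.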
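For the second case, we use regex lookarounds: we match the zero-width position that is preceded by a letter ((?<=[A-Za-z])) and followed by an uppercase letter ((?=[A-Z])), and insert _ at that position.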
gsub("(?<=[A-Za-z])(?=[A-Z])", "_", str2, perl = TRUE)
#[1] "End_Of_File" "Camel_Case" "A_B_C"
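The result above keeps the original capitalization. If fully lowercase output is wanted, as the question's wording suggests, one possible sketch is to wrap the same call in tolower (the helper name camel_to_snake is hypothetical, and lowercasing every letter is an assumption about the desired result):

```r
# Hypothetical helper: insert "_" at each letter/uppercase-letter
# boundary, then lower-case the whole string.
camel_to_snake <- function(x) {
  tolower(gsub("(?<=[A-Za-z])(?=[A-Z])", "_", x, perl = TRUE))
}

str2 <- c("EndOfFile", "CamelCase", "ABC")
res2 <- camel_to_snake(str2)
print(res2)
#[1] "end_of_file" "camel_case"  "a_b_c"
```

This lowercases the leading letter as well, so it does not reproduce "End_of_file" exactly; the examples in the question are not consistent about which letters keep their case.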
Data:
str1 <- c("Dr_dre", "Captain.Spock", "spider-man")
str2 <- c("EndOfFile", "CamelCase", "ABC")