Split strings between lowercase and uppercase characters in R?

Asked: 2017-04-30 12:41:24

Tags: r string

I have a vector of strings:

v1 <- c("Firstname LastnameFirstname Lastname", 
"Firstname Lastname", 
"Firstname Lastname", 
"Firstname LastnameFirstname Lastname")

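I want to split each string at every point where a lowercase letter is immediately followed by an uppercase letter, keeping both letters.

The desired output is: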

[1] "Firstname Lastname" "Firstname Lastname"   "Firstname Lastname"  "Firstname Lastname"  "Firstname Lastname" "Firstname Lastname"

Following an example on StackExchange, I tried combining gsub with strsplit:

unlist(strsplit( gsub("([a-z][A-Z])","\\1~",v1), "~" ))

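But this does not split between the two letters. The marker is inserted after the whole regex match, so the split happens one character too late: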

[1] "Firstname LastnameF" "irstname Lastname"   "Firstname Lastname"  "Firstname Lastname"  "Firstname LastnameF" "irstname Lastname"  

How can I split between the two characters while still keeping both of them?

1 Answer:

Answer 0: (score: 6)

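We can use regex lookarounds to match the position that is preceded by a lowercase letter (positive lookbehind, (?<=[a-z])) and followed by an uppercase letter (positive lookahead, (?=[A-Z])). Because lookarounds are zero-width, no character is consumed by the split: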

unlist(strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl = TRUE))
#[1] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" 
#[4] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
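
As a rough sketch of how the two approaches relate: the lookaround pattern splits at a position rather than on a character, while the gsub attempt from the question can also be repaired by capturing the two letters in separate groups so that the marker lands between them. The names parts, by_lookaround, marked and by_marker are only illustrative, and the marker variant assumes that "~" never occurs in the data.

```r
v1 <- c("Firstname LastnameFirstname Lastname",
        "Firstname Lastname",
        "Firstname Lastname",
        "Firstname LastnameFirstname Lastname")

# Zero-width split: the pattern matches the position between a lowercase
# and an uppercase letter, so neither letter is removed.
# strsplit() returns a list with one element per input string.
parts <- strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl = TRUE)
lengths(parts)                 # number of pieces found in each input string
by_lookaround <- unlist(parts)

# Repaired gsub variant: two capture groups, with the marker placed
# between them instead of after both letters.
# Assumption: "~" does not appear anywhere in the original strings.
marked <- gsub("([a-z])([A-Z])", "\\1~\\2", v1)
by_marker <- unlist(strsplit(marked, "~", fixed = TRUE))

# Compare the two results.
identical(by_lookaround, by_marker)
```

For this input both variants should yield the same six names. The lookaround version is usually the safer choice because it does not depend on picking a marker character that is absent from the data.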