I have a vector of strings:
v1 <- c("Firstname LastnameFirstname Lastname",
"Firstname Lastname",
"Firstname Lastname",
"Firstname LastnameFirstname Lastname")
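I want to split each string at the point where a lowercase letter is immediately followed by an uppercase letter, keeping both letters.
The desired output is: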
[1] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
Following an example on StackExchange, I tried combining strsplit with gsub:
unlist(strsplit( gsub("([a-z][A-Z])","\\1~",v1), "~" ))
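However, this does not split between the two letters. It splits after the two-character regex match, so the uppercase letter stays attached to the preceding element: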
[1] "Firstname LastnameF" "irstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname LastnameF" "irstname Lastname"
How can I split between the two characters while still keeping both of them?
Answer 0: (score: 6)
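We can use regex lookarounds to match the zero-width position that is preceded by a lowercase letter (positive lookbehind, (?<=[a-z])) and followed by an uppercase letter (positive lookahead, (?=[A-Z])). Because lookarounds consume no characters, both letters are kept. Note that perl = TRUE is needed for the lookaround syntax: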
unlist(strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl = TRUE))
#[1] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
#[4] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
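If you would rather stay close to the original gsub attempt, a possible fix is to capture the lowercase and uppercase letters in two separate groups and put the marker between them. This is only a sketch of an alternative, not a replacement for the lookaround version above, and it assumes that "~" never occurs in the data; the names marked and res are just for illustration.

```r
v1 <- c("Firstname LastnameFirstname Lastname",
        "Firstname Lastname",
        "Firstname Lastname",
        "Firstname LastnameFirstname Lastname")

# Two capture groups: \\1 is the lowercase letter, \\2 the uppercase one.
# The marker goes between them, so neither letter is lost in the split.
# Assumption: "~" does not appear anywhere in v1.
marked <- gsub("([a-z])([A-Z])", "\\1~\\2", v1)

# fixed = TRUE treats "~" as a literal string rather than a regex.
res <- unlist(strsplit(marked, "~", fixed = TRUE))
res
```

For the sample vector this should return the same six "Firstname Lastname" elements. The original attempt failed because a single group ([a-z][A-Z]) holds both letters, so the marker could only be placed after the pair.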