How to place a character or delimiter before a string in R

Date: 2016-08-23 06:59:27

Tags: r gsub strsplit

I have a string:

"Father’s Name : ABC NaskarDate of Birth : 18-01-1979Permanent Address: This is the address field for the personContact Numbers : 98413***28Passport Number:PAN Number: AEFXXXXXXXLanguages Known: Tamil, English"

My desired output is:

"|||Father’s Name : ABC Naskar|||Date of Birth : 18-01-1979|||Permanent Address: This is the address field for the person|||Contact Numbers : 98413***28|||Passport Number:|||PAN Number: AEFXXXXXXX|||Languages Known: Tamil, English"

That is, I want to add "|||" before certain specific strings, such as Father’s Name, Date of Birth, and so on. Thanks.

2 answers:

Answer 0 (score: 1)

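We could not find a general pattern, but based on the string shown, it seems the ||| delimiter should go at the start of the string (^), wherever a lowercase letter or a digit is immediately followed by an uppercase letter, before PAN, and between the run of X characters and Languages. All of these are positions rather than characters, so zero-width regex lookarounds should be used: the lookbehind (?<=...) checks what precedes a position, the lookahead (?=...) checks what follows it, and neither consumes any text.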

gsub("(?<=[a-z0-9])(?=[A-Z])|^|(?<=[XXX])(?=Lang)|(?=PAN)", "|||", str1, perl = TRUE)
#[1] "|||Father’s Name : ABC Naskar|||Date of Birth : 18-01-1979|||Permanent Address: This is the address field for the person|||Contact Numbers : 98413***28|||Passport Number:|||PAN Number: AEFXXXXXXX|||Languages Known: Tamil, English"

Data

str1 <- "Father’s Name : ABC NaskarDate of Birth : 18-01-1979Permanent Address: This is the address field for the personContact Numbers : 98413***28Passport Number:PAN Number: AEFXXXXXXXLanguages Known: Tamil, English"
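As a rough sketch of how the pattern above is put together, the alternatives can be named and joined with "|", and the delimited result can then be split into fields with strsplit. The names sample_str, pieces, pattern, delimited and fields are hypothetical, and sample_str is a shortened, ASCII-only stand-in for the question's string, not the original data. Splitting afterwards is an assumption about the end goal, suggested only by the strsplit tag.

```r
# Shortened, ASCII-only stand-in for the question's string (hypothetical data)
sample_str <- "Date of Birth : 18-01-1979Permanent Address: Some addressPassport Number:PAN Number: AEFXXXXXXXLanguages Known: Tamil, English"

# Each piece is a zero-width position where "|||" should be inserted
pieces <- c(
  start         = "^",                       # beginning of the string
  case_boundary = "(?<=[a-z0-9])(?=[A-Z])",  # lowercase/digit followed by uppercase
  before_pan    = "(?=PAN)",                 # ":" is followed by "PAN"
  before_lang   = "(?<=X)(?=Lang)"           # same as (?<=[XXX]) in the answer
)
pattern <- paste(pieces, collapse = "|")

# perl = TRUE is required because lookarounds are a PCRE feature
delimited <- gsub(pattern, "|||", sample_str, perl = TRUE)

# fixed = TRUE treats "|||" literally instead of as a regex alternation
fields <- strsplit(delimited, "|||", fixed = TRUE)[[1]]
fields <- fields[nzchar(fields)]  # drop the empty piece before the leading delimiter
print(fields)
```

Because the delimiter also sits at the very start of the string, strsplit returns an empty first element, which the nzchar filter removes.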

Answer 1 (score: 1)

# x holds the input string from the question
gsub("(Father’s Name)|(Date of Birth)", "|||\\1\\2", x)
[1] "|||Father’s Name : ABC Naskar|||Date of Birth : 18-01-1979Permanent Address: This is the address field for the personContact Numbers : 98413***28Passport Number:PAN Number: AEFXXXXXXXLanguages Known: Tamil, English"

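Fortunately, the regex OR operator, "|", effectively "distributes" over the replacement: only one of the two groups matches at a time, and the backreference to the other group expands to an empty string, so "|||\\1\\2" inserts the delimiter before whichever label matched. The example handles only two labels; the remaining ones would have to be added to the pattern in the same way.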
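The same idea can be extended to every label without one capture group per label. The following is a minimal sketch, assuming the field labels are known in advance and contain no regex metacharacters. The names sample_str, labels, pattern and result are hypothetical, and sample_str is a shortened, ASCII-only stand-in for the question's string.

```r
# Shortened, ASCII-only stand-in for the question's string (hypothetical data)
sample_str <- "Date of Birth : 18-01-1979Permanent Address: Some addressPassport Number:PAN Number: AEFXXXXXXXLanguages Known: Tamil, English"

# Known field labels; assumed to contain no regex metacharacters
labels <- c("Date of Birth", "Permanent Address", "Passport Number",
            "PAN Number", "Languages Known")

# One capture group wrapping all alternatives: (label1|label2|...)
pattern <- paste0("(", paste(labels, collapse = "|"), ")")

# \\1 is whichever label matched, so a single backreference is enough
result <- gsub(pattern, "|||\\1", sample_str)
print(result)
```

Compared with the lookaround approach, this depends only on the list of labels rather than on letter case, so it should be less sensitive to values that happen to contain a lowercase letter followed by an uppercase one.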