R: Split on negative lookaround

Time: 2019-07-14 09:47:19

Tags: r regex-lookarounds strsplit

Say I need to split caabacb into individual letters, except when a letter is followed by b, so that the result is:

"c"  "a"  "ab"  "a"  "cb"

I tried a pattern with a negative lookaround that looks fine in a regex tester but does not work in R. What am I doing wrong?

2 answers:

Answer 0: (score: 3)

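strsplit() probably needs something to split on. You could insert a delimiter, e.g. ";", at the split positions with gsub() and then split on that delimiter: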

strsplit(gsub("(?!^.|b|\\b)", ";", "caabacb", perl=TRUE), ";", perl=TRUE)
# [[1]]
# [1] "c"  "a"  "ab" "a"  "cb"
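To make the two steps easier to follow, here is a minimal sketch that runs them separately. The variable names marked and pieces are only illustrative and are not part of the answer above; fixed = TRUE is an assumption that is safe here because ";" is a literal delimiter.

```r
x <- "caabacb"

# Step 1: insert ";" at every position that is NOT
#   - the start of the string (^.),
#   - directly before a "b", or
#   - a word boundary (\\b), which covers the end of the string.
marked <- gsub("(?!^.|b|\\b)", ";", x, perl = TRUE)
print(marked)

# Step 2: split on the inserted delimiter, treated as a literal string.
pieces <- strsplit(marked, ";", fixed = TRUE)[[1]]
print(pieces)
# [1] "c"  "a"  "ab" "a"  "cb"
```

This only works if the delimiter does not already occur in the input, so pick a character that cannot appear in your data.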

Answer 1: (score: 3)

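You could also add a positive lookbehind that matches any character, (?<=.). On its own, the positive lookbehind (?<=.) would split the string at every character (without removing any characters), but the negative lookahead (?!b) excludes the splits where a character is followed by b: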

strsplit('caabacb', '(?<=.)(?!b)', perl = TRUE)
#> [[1]]
#> [1] "c"  "a"  "ab" "a"  "cb"
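As a small illustration of how the two lookarounds combine, the sketch below applies the lookbehind on its own first and then adds the lookahead. The names every_char and unless_b are hypothetical and used only for this example.

```r
x <- "caabacb"

# Lookbehind only: a zero-width split after every character.
every_char <- strsplit(x, "(?<=.)", perl = TRUE)[[1]]
print(every_char)
# [1] "c" "a" "a" "b" "a" "c" "b"

# Adding (?!b) drops the split positions that sit directly before a "b",
# so each "b" stays attached to the letter in front of it.
unless_b <- strsplit(x, "(?<=.)(?!b)", perl = TRUE)[[1]]
print(unless_b)
# [1] "c"  "a"  "ab" "a"  "cb"
```

Both patterns are zero-width, so no characters are consumed by the split. perl = TRUE is required because lookarounds are not supported by R's default regex engine.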