This is the string I want to split:
b[1]
[1] "County January 2016 February 2016 March 2016 April 2016 May 2016 June 2016 July 2016 August 2016 September 2016 October 2016 November 2016 December 2016\r"
From the post "split string with regex" I gather there is no ready-made function for this; I just want to confirm.
This is my code:
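# Locate "County" and every "<Month> <Year>" token
split.pos <- gregexpr("County|([A-Za-z]{1,} [0-9]{4,})", b[1], perl = FALSE)
split.length <- attr(split.pos[[1]], "match.length")
split.start <- as.integer(split.pos[[1]])
# Note: start + length is one character past each match, so every piece keeps a trailing character
substring(b[1], split.start, split.start+split.length)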
[1] "County " "January 2016 " "February 2016 " "March 2016 "
[5] "April 2016 " "May 2016 " "June 2016 " "July 2016 "
[9] "August 2016 " "September 2016 " "October 2016 " "November 2016 "
[13] "December 2016\r"
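The trailing space (and the final "\r") in the output above comes from the end index: `substring()` takes an inclusive last position, so a match of length n that starts at s ends at s + n - 1. A minimal sketch of the same approach with that adjustment, assuming `b` holds the single string shown at the top (the name `pieces` is only illustrative):

```r
b <- "County January 2016 February 2016 March 2016 April 2016 May 2016 June 2016 July 2016 August 2016 September 2016 October 2016 November 2016 December 2016\r"

# Find the start and length of every "County" or "<Month> <Year>" match
split.pos <- gregexpr("County|([A-Za-z]+ [0-9]{4})", b)[[1]]
split.start <- as.integer(split.pos)
split.length <- attr(split.pos, "match.length")

# substring() is inclusive on both ends, so the last index is start + length - 1
pieces <- substring(b, split.start, split.start + split.length - 1)
print(pieces)
```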
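Is there a better way? Thanks.
Answer 0: (score: 1)
We can use strsplit with a regular expression that splits on whitespace preceded by a digit or followed by an uppercase letter: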
b <- "County January 2016 February 2016 March 2016 April 2016 May 2016 June 2016 July 2016 August 2016 September 2016 October 2016 November 2016 December 2016\r"
strsplit(b, "(?<=[0-9])\\s+|\\s+(?=[A-Z])", perl = TRUE)[[1]]
#[1] "County" "January 2016" "February 2016" "March 2016" "April 2016" "May 2016" "June 2016" "July 2016" "August 2016"
#[10] "September 2016" "October 2016" "November 2016" "December 2016"
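As an alternative to splitting on the separators, the tokens themselves can be extracted. The sketch below uses base R's `regmatches()` together with `gregexpr()`; the pattern and the variable names `pattern` and `pieces` are assumptions chosen for this particular input, not part of the answer above.

```r
b <- "County January 2016 February 2016 March 2016 April 2016 May 2016 June 2016 July 2016 August 2016 September 2016 October 2016 November 2016 December 2016\r"

# Match either the literal "County" or a word followed by a four-digit year
pattern <- "County|[A-Za-z]+ [0-9]{4}"

# gregexpr() finds every match; regmatches() pulls out the matched text
pieces <- regmatches(b, gregexpr(pattern, b))[[1]]
print(pieces)
```

Because only the matched text is returned, the trailing "\r" is left out without a separate trimming step. This does rely on every field fitting the pattern, so a header with a different shape would be skipped silently.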