I need to split a character object named C, which looks like this:
"{TV}{Property}{Furniture}{Car or Van}{Phone}{Computer or Tablet}{Holiday}{None of the above}"
First, I tried strsplit:
D<-strsplit(C[1], split = "}")
That works and returns:
[1] "{TV" "{Property" "{Furniture" "{Car or Van" "{Phone" "{Computer or Tablet" "{Holiday"
[8] "{None of the above"
But I also want to get rid of the remaining "{". When I try to do that, though, R gets "confused" by the curly brace:
E<-unlist(strsplit(D[[1]], split="{"))
Error in strsplit(D[[1]], split = "{") : invalid regular expression '{', reason 'Missing '}''
Any suggestions?
Answer 0 (score: 3)
You can escape the braces, i.e. (\\{|\\}), or use a character class, [{}]:
D <- strsplit(C, "[{}]")[[1]]
D[nzchar(D)]
#[1] "TV" "Property" "Furniture"
#[4] "Car or Van" "Phone" "Computer or Tablet"
#[7] "Holiday" "None of the above"
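The escaped form mentioned at the top of this answer is not spelled out in code. A minimal sketch of it, assuming the same input string C as in the question (the names parts and items are only illustrative):

```r
# Sample input copied from the question.
C <- "{TV}{Property}{Furniture}{Car or Van}{Phone}{Computer or Tablet}{Holiday}{None of the above}"

# Escape each brace so the regex engine treats it as a literal character
# instead of the start or end of a repetition count such as a{2,3}.
parts <- strsplit(C, "\\{|\\}")[[1]]

# The leading "{" and each adjacent "}{" pair leave empty strings behind;
# nzchar() keeps only the non-empty pieces.
items <- parts[nzchar(parts)]
print(items)
```

This should give the same eight items as the character-class version; the character class is simply shorter because braces need no escaping inside [ ].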
Or
strsplit(C, "\\{|}\\{|}")[[1]][-1]
#[1] "TV" "Property" "Furniture"
#[4] "Car or Van" "Phone" "Computer or Tablet"
#[7] "Holiday" "None of the above"
Or another option:
regmatches(C,gregexpr("[^{}]+", C))[[1]]
#[1] "TV" "Property" "Furniture"
#[4] "Car or Van" "Phone" "Computer or Tablet"
#[7] "Holiday" "None of the above"
Or
library(stringr)
str_extract_all(C, '[^{}]+')[[1]]
#[1] "TV" "Property" "Furniture"
#[4] "Car or Van" "Phone" "Computer or Tablet"
#[7] "Holiday" "None of the above"
Or
library(stringi)
stri_extract_all_regex(C, '[^{}]+')[[1]]
#[1] "TV" "Property" "Furniture"
#[4] "Car or Van" "Phone" "Computer or Tablet"
#[7] "Holiday" "None of the above"
Or
library(qdap)
unname(bracketXtract(C, 'curly'))
#[1] "TV" "Property" "Furniture"
#[4] "Car or Van" "Phone" "Computer or Tablet"
#[7] "Holiday" "None of the above"
Answer 1 (score: 3)
Using strsplit alone, you can do:
strsplit(C, "[{}]+")[[1]][-1]
# [1] "TV" "Property" "Furniture"
# [4] "Car or Van" "Phone" "Computer or Tablet"
# [7] "Holiday" "None of the above"
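strsplit's algorithm adds the part of the string to the left of each match to the output, then removes the match and everything to its left. Because the string starts with one of the characters we are splitting on, the first element of the result is an empty string, so we only need to drop it (which is what [-1] does).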
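To make that behaviour concrete, here is a small sketch on two made-up strings (they are not from the question; lead and trail are illustrative names). It shows that a delimiter at the start produces an empty first element, while a delimiter at the end does not produce an empty last one:

```r
# A match at the very start: the (empty) text to its left is added to the output.
lead <- strsplit("{a}{b}", "[{}]+")[[1]]
print(lead)   # "" "a" "b"

# A match at the very end: nothing is left after it, so no empty element is added.
trail <- strsplit("a}{b}", "[{}]+")[[1]]
print(trail)  # "a" "b"

# Dropping the first element is therefore enough for strings that start with "{".
print(lead[-1])
```

Note that [-1] assumes the string really does begin with a brace; for input that might not, filtering with nzchar() is the safer choice.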
Answer 2 (score: 0)
Another way to clean the data:
gsub("[{}]","",strsplit(C,"\\}\\{")[[1]])
[1] "TV" "Property" "Furniture" "Car or Van"
[5] "Phone" "Computer or Tablet" "Holiday" "None of the above"
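Breaking that one-liner into two steps may make it easier to follow. This is only a sketch, assuming the same C as in the question (pieces and cleaned are illustrative names): splitting on the literal pair "}{" leaves one stray brace on the first element and one on the last, and gsub then removes them.

```r
# Sample input copied from the question.
C <- "{TV}{Property}{Furniture}{Car or Van}{Phone}{Computer or Tablet}{Holiday}{None of the above}"

# Step 1: split on the two-character boundary "}{" between items.
pieces <- strsplit(C, "\\}\\{")[[1]]

# Only the outermost braces survive the split.
print(pieces[1])               # "{TV"
print(pieces[length(pieces)])  # "None of the above}"

# Step 2: delete any remaining brace from every element.
cleaned <- gsub("[{}]", "", pieces)
print(cleaned)
```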