Question

I need to split a character object named C, which looks like this:

"{TV}{Property}{Furniture}{Car or Van}{Phone}{Computer or Tablet}{Holiday}{None of the above}"

First, I tried strsplit:

D<-strsplit(C[1], split = "}")

That works, and returns:

[1] "{TV"                 "{Property"           "{Furniture"          "{Car or Van"         "{Phone"              "{Computer or Tablet" "{Holiday"           
[8] "{None of the above"

But I also want to get rid of the remaining "{". When I try to do that, R gets "confused" by the curly brace:

E<-unlist(strsplit(D[[1]], split = "{"))
Error in strsplit(D[[1]], split = "{") : invalid regular expression '{', reason 'Missing '}''

Any suggestions?

Answer 1

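You can escape the braces, i.e. (\\{|\\}), or put them in a character class, [{}]. Splitting on either brace leaves empty strings in the result, which nzchar filters out: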

 D <- strsplit(C, "[{}]")[[1]]
 D[nzchar(D)]
 #[1] "TV"                 "Property"           "Furniture"         
 #[4] "Car or Van"         "Phone"              "Computer or Tablet"
 #[7] "Holiday"            "None of the above"
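
The error in the question arises because an unescaped "{" is read as the start of a regex repetition quantifier such as a{2,3}. As a minimal sketch, the snippet below compares three ways of making strsplit treat the brace literally. The shortened sample string and the variable names by_escape, by_class and by_fixed are made up for illustration.

```r
# Shortened version of the string from the question (illustrative only).
C <- "{TV}{Property}{Car or Van}"

# 1. Escape the brace so the regex engine reads it as a literal character.
by_escape <- strsplit(C, "\\{")[[1]]

# 2. Put it in a character class, where "{" has no special meaning.
by_class <- strsplit(C, "[{]")[[1]]

# 3. Skip regex matching entirely: fixed = TRUE treats the pattern as plain text.
by_fixed <- strsplit(C, "{", fixed = TRUE)[[1]]

# All three give the same pieces; the closing braces are still attached,
# and the first element is an empty string.
print(by_fixed)
print(identical(by_escape, by_class) && identical(by_class, by_fixed))
```

fixed = TRUE is probably the simplest choice when the separator is a single known character. The character class is handier when both braces should be removed in one pass.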

Or:

  strsplit(C, "\\{|}\\{|}")[[1]][-1]
  #[1] "TV"                 "Property"           "Furniture"         
  #[4] "Car or Van"         "Phone"              "Computer or Tablet"
  #[7] "Holiday"            "None of the above"

Another option is to extract the runs of non-brace characters:

  regmatches(C,gregexpr("[^{}]+", C))[[1]]
  #[1] "TV"                 "Property"           "Furniture"         
  #[4] "Car or Van"         "Phone"              "Computer or Tablet"
  #[7] "Holiday"            "None of the above"

Or, with stringr:

  library(stringr)
  str_extract_all(C, '[^{}]+')[[1]]
  #[1] "TV"                 "Property"           "Furniture"         
  #[4] "Car or Van"         "Phone"              "Computer or Tablet"
  #[7] "Holiday"            "None of the above"

Or, with stringi:

  library(stringi)
  stri_extract_all_regex(C, '[^{}]+')[[1]]
  #[1] "TV"                 "Property"           "Furniture"         
  #[4] "Car or Van"         "Phone"              "Computer or Tablet"
  #[7] "Holiday"            "None of the above"

Or, with qdap:

  library(qdap)
  unname(bracketXtract(C, 'curly'))
  #[1] "TV"                 "Property"           "Furniture"         
  #[4] "Car or Van"         "Phone"              "Computer or Tablet"
  #[7] "Holiday"            "None of the above"

Answer 2

Using strsplit alone (here x holds the same string as C in the question), you can do:

strsplit(x, "[{}]+")[[1]][-1]
# [1] "TV"                 "Property"           "Furniture"         
# [4] "Car or Van"         "Phone"              "Computer or Tablet"
# [7] "Holiday"            "None of the above"

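This works because of how strsplit proceeds: it adds the text to the left of each match to the output, then discards the match and everything to its left, and repeats on what remains. Since the string starts with one of the characters we are splitting on, the first element of the result is an empty string, so we only need to drop that first element (done with [-1]).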
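
To make that behaviour concrete, here is a small sketch on a made-up two-item string. The variables x, parts and y are illustrative only. It shows the leading empty string, the absence of a trailing one, and why [-1] is only safe when the input really starts with a brace.

```r
# Made-up short input that starts and ends with a brace.
x <- "{TV}{Phone}"

parts <- strsplit(x, "[{}]+")[[1]]
print(parts)            # the first element is the empty string before the leading "{"

# A match at the very end of the string does not produce a trailing "".
print(length(parts))

# Dropping the first element by position...
print(parts[-1])

# ...or dropping empty strings by content gives the same result here.
print(parts[nzchar(parts)])

# If the input does NOT start with a brace, [-1] would discard real data,
# whereas nzchar() keeps it.
y <- "TV}{Phone}"
print(strsplit(y, "[{}]+")[[1]][-1])
y_parts <- strsplit(y, "[{}]+")[[1]]
print(y_parts[nzchar(y_parts)])
```

Filtering with nzchar is the more defensive choice if the data may not always begin with "{".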

Answer 3

Another solution: split on the "}{" boundaries, then clean up the leftover braces:

gsub("[{}]","",strsplit(C,"\\}\\{")[[1]])

[1] "TV"                 "Property"           "Furniture"          "Car or Van"        
[5] "Phone"              "Computer or Tablet" "Holiday"            "None of the above"
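
The question indexes C[1], which suggests that C may hold several such strings. As a sketch under that assumption, the same split-then-clean idea can be applied to every element at once. The two sample strings and the names pieces and cleaned are invented for illustration.

```r
# Hypothetical vector with several brace-delimited strings.
C <- c("{TV}{Phone}", "{Holiday}{None of the above}")

# Split each string on the literal "}{" boundary; fixed = TRUE avoids regex escaping.
pieces <- strsplit(C, "}{", fixed = TRUE)

# Remove the brace left at the start of the first piece and the end of the last.
cleaned <- lapply(pieces, function(p) gsub("[{}]", "", p))

print(cleaned)
```

The result is a list with one character vector per input string, so the items of each string stay grouped together.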

Splitting a character object: problem with {
