Question

How can I efficiently split the following string on the first comma using base R?

x <- "I want to split here, though I don't want to split elsewhere, even here."
strsplit(x, ???)

Desired outcome (2 strings):

[[1]]
[1] "I want to split here"   "though I don't want to split elsewhere, even here."

Thanks in advance.

EDIT: I didn't think to mention this. The solution needs to generalize to a column, i.e. a vector of strings like this:

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

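The result can be two columns, one long vector (I can take every other element), or a list in which each index ([[n]]) holds two strings.

Apologies for the lack of clarity.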

Answer 1

Here's what I'd probably do. It may look hacky, but since sub() and strsplit() are both vectorized, it also runs smoothly when handed multiple strings.

XX <- "SoMeThInGrIdIcUlOuS"
strsplit(sub(",\\s*", XX, x), XX)
# [[1]]
# [1] "I want to split here"                               
# [2] "though I don't want to split elsewhere, even here."
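
Since the question's edit asks for this to work on a vector, here is a minimal sketch of the same placeholder trick applied to y. The name res and the do.call(rbind, ...) step are my additions, and it assumes the placeholder never occurs in the data and that every string contains at least one comma.

```r
# Same placeholder trick as above, applied to a vector of strings.
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")
XX <- "SoMeThInGrIdIcUlOuS"  # assumed not to occur anywhere in the data

# sub() replaces only the first match, so only the first comma becomes XX
parts <- strsplit(sub(",\\s*", XX, y), XX, fixed = TRUE)

# Stack the pieces into a two-column character matrix (one row per string)
res <- do.call(rbind, parts)
res
```

Leaving parts as it is gives the list form the question mentions; the rbind step gives the two-column form.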

Answer 2

From the stringr package:

library(stringr)
str_split_fixed(x, pattern = ', ', n = 2)
#      [,1]                  
# [1,] "I want to split here"
#      [,2]                                                
# [1,] "though I don't want to split elsewhere, even here."

(This is a matrix with one row and two columns.)

Answer 3

Here is another solution, using a regular expression to capture what comes before and after the first comma.

x <- "I want to split here, though I don't want to split elsewhere, even here."
library(stringr)
str_match(x, "^(.*?),\\s*(.*)")[,-1] 
# [1] "I want to split here"                              
# [2] "though I don't want to split elsewhere, even here."
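
If you need to stay in base R, the same capture-group idea can probably be done with regexec() and regmatches(). This is only a sketch, not the answer's method: I swapped the lazy `.*?` for `[^,]*` (which also stops at the first comma), the names m and caps are mine, and it assumes every string contains a comma.

```r
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# Group 1: everything before the first comma; group 2: everything after it
# (whitespace directly after the comma is skipped)
m <- regmatches(y, regexec("^([^,]*),\\s*(.*)$", y))

# Each list element is c(full match, group 1, group 2); drop the full match
caps <- do.call(rbind, lapply(m, function(p) p[-1]))
caps
```

A string without a comma would yield character(0) in m, so those rows would need separate handling.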

Answer 4

library(stringr)

str_sub(x,end = min(str_locate(string=x, ',')-1))

This gets you the first part you want. Change end= to start= in str_sub to get the other part.

For example:

str_sub(x,start = min(str_locate(string=x, ',')+1 ))

And wrap it in str_trim to get rid of the leading space:

str_trim(str_sub(x,start = min(str_locate(string=x, ',')+1 )))
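
One caveat: min() collapses everything to a single position, so on a vector of strings the same cut point would be applied to every element. A rough base R sketch of a per-string version, using regexpr() and substring() instead of stringr (the names pos, first and rest are mine, and it assumes each string contains a comma):

```r
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# Position of the first comma in each string (one value per element)
pos <- regexpr(",", y, fixed = TRUE)

# Text before the first comma, and text after it with surrounding spaces trimmed
first <- substring(y, 1, pos - 1)
rest  <- trimws(substring(y, pos + 1))

cbind(first, rest)
```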

Answer 5

This works, but I like Josh O'Brien's better:

# Split on every comma, then glue everything after the first piece back together
y <- strsplit(x, ",")
sapply(y, function(x) data.frame(x = x[1],
    z = paste(x[-1], collapse = ",")), simplify = FALSE)

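Inspired by chase's response.

A number of people gave non-base approaches, so I figured I'd add the one I usually use (though in this case I needed a base response):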

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")
library(reshape2)
colsplit(y, ",", c("x","z"))

Split on the first comma in a string
