Split a vector of strings on a character to return a matrix

Asked: 2014-10-22 18:10:55

Tags: regex r string

I have

rownames(results.summary)
[1] "2 - 1" "3 - 1" "4 - 1"

What I want returned is the matrix

2  1
3  1
4  1

My approach is:

for (i in seq_along(rownames(results.summary))) {
  current.split <- unlist(strsplit(rownames(results.summary)[i], "-"))
  matrix.results$comparison.group[i] <- trim(current.split[1])
  matrix.results$control.group[i] <- trim(current.split[2])
}

The trim function simply removes any whitespace from both ends of a string.
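The definition of trim is not shown in the question. As a minimal sketch, assuming it does nothing more than strip leading and trailing whitespace, it might look like the following (the body here is hypothetical, not the asker's actual code):

```r
# Hypothetical stand-in for the question's trim(); the original definition is not shown.
trim <- function(x) gsub("^\\s+|\\s+$", "", x)

# Splitting on "-" alone leaves the surrounding spaces in place ...
parts <- unlist(strsplit("2 - 1", "-"))
# ... which trim() then removes from each piece.
trimmed <- trim(parts)
print(trimmed)
```

In R 3.2.0 and later, base R's trimws() covers the same need without a custom helper.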

I have been learning regular expressions and was wondering whether there is a more elegant, vectorized solution.

5 answers:

Answer 0: (score: 6)

There is no need for strsplit; just read the strings with read.table:

 read.table(text=vec,sep='-',strip.white = TRUE) ## see @flodel comment
  V1 V2
1  2  1
2  3  1
3  4  1

where vec is:

vec <-  c("2 - 1", "3 - 1", "4 - 1")
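Note that read.table returns a data frame, while the question asks for a matrix. A minimal sketch of the extra conversion step, assuming the default V1/V2 column names are not wanted in the result:

```r
vec <- c("2 - 1", "3 - 1", "4 - 1")

# Parse each string as one row, with "-" as the field separator;
# strip.white = TRUE drops the spaces around each field.
df <- read.table(text = vec, sep = "-", strip.white = TRUE)

# Convert the data frame to a matrix and drop the dimnames
# so that only the bare values remain.
m <- unname(as.matrix(df))
print(m)
```

Because "-" is the separator here, this sketch would not suit input that contains negative numbers.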

Answer 1: (score: 3)

This should work:

vv <- c("2 - 1", "3 - 1", "4 - 1")
matrix(as.numeric(unlist(strsplit(vv, " - "))), ncol = 2, byrow = TRUE)
#      [,1] [,2]
# [1,]    2    1
# [2,]    3    1
# [3,]    4    1
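Since the question asks about regular expressions, a variant of the same idea is to let the split pattern absorb the whitespace, so no separate trimming step is needed. This is a sketch that assumes every element contains exactly one "-" and therefore splits into exactly two pieces:

```r
vv <- c("2 - 1", "3 - 1", "4 - 1")

# "\\s*-\\s*" matches the dash together with any surrounding whitespace,
# so "2 - 1", "2-1" and "2  -1" would all split into "2" and "1".
parts <- strsplit(vv, "\\s*-\\s*")

# Convert each pair to numeric and stack the pairs as rows.
m <- do.call(rbind, lapply(parts, as.numeric))
print(m)
```

Binding row by row avoids having to remember byrow = TRUE, at the cost of one extra lapply call.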

Answer 2: (score: 3)

You could also try scan:

vec <-  c("2 - 1", "3 - 1", "4 - 1")
s <- scan(text = vec, what = integer(), sep = "-", quiet = TRUE)
matrix(s, length(s)/2, byrow = TRUE)
#      [,1] [,2]
# [1,]    2    1
# [2,]    3    1
# [3,]    4    1

Another option is cSplit:

library(splitstackshape)
cSplit(data.frame(vec), "vec", sep = " - ", fixed=TRUE)
#    vec_1 vec_2
# 1:     2     1
# 2:     3     1
# 3:     4     1

Answer 3: (score: 2)

You can use str_match from the stringr package:

library(stringr)
##
x <- c("2 - 1","3 - 1","4 - 1")
##
## \\d+ (rather than \\d) also captures numbers with more than one digit
cmat <- str_match(x, "(\\d+)\\s*-\\s*(\\d+)")[,-1]
apply(cmat,2,as.numeric)
     [,1] [,2]
[1,]    2    1
[2,]    3    1
[3,]    4    1
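The same extraction can be sketched in base R, without stringr, by pulling out every run of digits with gregexpr and regmatches. The multi-digit input below is hypothetical and is only there to show that the pattern is not limited to single digits; the sketch assumes each string contains exactly two numbers:

```r
# Hypothetical input with multi-digit groups.
x <- c("12 - 1", "3 - 10", "4 - 1")

# gregexpr() locates every run of digits; regmatches() extracts them,
# giving one character vector of matches per input string.
digits <- regmatches(x, gregexpr("[0-9]+", x))

# Flatten and refill row by row, two numbers per row.
m <- matrix(as.numeric(unlist(digits)), ncol = 2, byrow = TRUE)
print(m)
```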

Answer 4: (score: 2)

Using colsplit from reshape2 (x is the character vector of strings to split):

library(reshape2)
colsplit(x, " - ",  c("A", "B"))
#   A B
# 1 2 1
# 2 3 1
# 3 4 1

Or using tidyr's separate:

library(tidyr)
separate(data.frame(x), x, c("A", "B"), sep = " - ")
#   A B
# 1 2 1
# 2 3 1
# 3 4 1