Extract breakpoints from cut

Date: 2016-04-12 18:16:08

Tags: r

The documentation for the cut function gives one way to extract the breakpoints:

aaa <- c(1,2,3,4,5,2,3,4,5,6,7)
labs <- levels(cut(aaa, 3))
cbind(lower = as.numeric( sub("\\((.+),.*", "\\1", labs) ),
      upper = as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", labs) ))

#      lower upper
# [1,] 0.994  3.00
# [2,] 3.000  5.00
# [3,] 5.000  7.01
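
A side note on precision: the labels that cut prints are rounded to dig.lab significant digits (3 by default), so anything parsed back out of them is rounded as well. Below is a minimal sketch, assuming more digits are wanted, that raises dig.lab before parsing. The helper name parse_breaks is made up for illustration; it only wraps the two sub() calls shown above.

```r
aaa <- c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 7)

# parse_breaks is a hypothetical helper wrapping the two sub() calls from ?cut
parse_breaks <- function(labs) {
  cbind(lower = as.numeric(sub("\\((.+),.*", "\\1", labs)),
        upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs)))
}

# Default labels: rounded to 3 significant digits
rounded <- parse_breaks(levels(cut(aaa, 3)))

# Same cut, but labels formatted with up to 10 significant digits
precise <- parse_breaks(levels(cut(aaa, 3, dig.lab = 10)))
```

Comparing rounded and precise shows how much the default label formatting hides; either way the values come from the labels, not from cut's internal break vector.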

Is there another, built-in way to extract the breakpoints?

2 answers:

Answer 0 (score: 4)

1) read.table. I don't think there is any direct way, but this is shorter:

read.table(text = gsub("[^.0-9]", " ", labs), col.names = c("lower", "upper"))

giving this data.frame:

  lower upper
1 0.994  3.00
2 3.000  5.00
3 5.000  7.01
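
One caveat with the pattern "[^.0-9]": it replaces every character that is not a digit or a dot, which includes minus signs. The sketch below assumes data with negative values (the vector x is made up for illustration) and uses a slightly wider character class that also keeps signs and exponent letters. It is a variant of the idea above, not part of the original answer.

```r
# Hypothetical data with negative values
x <- c(-3, -2, -1, 0, 1, 2, 3)
labs_neg <- levels(cut(x, 3))

# Keep digits, dots, signs and exponent markers; blank out everything else
cleaned <- gsub("[^-+.0-9eE]", " ", labs_neg)

breaks_df <- read.table(text = cleaned, col.names = c("lower", "upper"))
```

With the original pattern the first lower bound would come back positive, because its minus sign would have been replaced by a space.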

2) gsubfn::strapply. This is another possibility:

library(gsubfn)

strapply(labs, "[.0-9]+", as.numeric, simplify = rbind)

giving this matrix:

      [,1] [,2]
[1,] 0.994 3.00
[2,] 3.000 5.00
[3,] 5.000 7.01
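
For readers who do not have gsubfn installed, roughly the same idea (pull out every run of digits and dots, convert to numeric, bind by row) can be sketched in base R with gregexpr and regmatches. This is an alternative sketch, not what strapply does internally.

```r
aaa <- c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 7)
labs <- levels(cut(aaa, 3))

# One character vector of numeric-looking tokens per label
tokens <- regmatches(labs, gregexpr("[.0-9]+", labs))

# Convert each pair to numeric and stack the pairs as rows
breaks_mat <- do.call(rbind, lapply(tokens, as.numeric))
colnames(breaks_mat) <- c("lower", "upper")
```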

3) gsubfn::read.pattern. And another:

library(gsubfn)

read.pattern(text = labs, pattern = ".(.+),(.+).", col.names = c("lower", "upper"))

giving:

  lower upper
1 0.994  3.00
2 3.000  5.00
3 5.000  7.01
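
A comparable capture-group approach is available without extra packages through utils::strcapture, assuming R 3.4.0 or later. The sketch below reuses the same pattern with anchors added; the proto data frame supplies the column names and types.

```r
aaa <- c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 7)
labs <- levels(cut(aaa, 3))

# Each capture group becomes one column, coerced to the type given in proto
breaks_df <- strcapture("^.(.+),(.+).$", labs,
                        proto = data.frame(lower = numeric(), upper = numeric()))
```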

Answer 1 (score: 1)

Here is a solution with strsplit():

sapply(strsplit(labs, "\\(|,|]"), function(x) as.numeric(x[-1]))
#       [,1] [,2] [,3]
# [1,] 0.994    3 5.00
# [2,] 3.000    5 7.01
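
Note that sapply returns one column per interval here, so the result is transposed relative to the other solutions. A small sketch of the same strsplit() call, transposed with t() and given column names (the names lower and upper are chosen to match the question):

```r
aaa <- c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 7)
labs <- levels(cut(aaa, 3))

# Split each label on "(", "," or "]"; the first piece is empty, so drop it
pieces <- strsplit(labs, "\\(|,|]")

# Transpose so each row is one interval
brk <- t(sapply(pieces, function(x) as.numeric(x[-1])))
colnames(brk) <- c("lower", "upper")
```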