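I want to cut a vector of values ranging from 0 to 70 into x categories, and I would like the upper limit of each category. So far I have tried using cut()
and am trying to extract the limits from the levels.
I have a list of levels, and I want to extract the second number from each level. How can I extract the value between the comma and the ] (that is the number I'm interested in)?
I have: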
> levels(bins)
[1] "(-0.07,6.94]" "(6.94,14]" "(14,21]" "(21,28]" "(28,35]"
[6] "(35,42]" "(42,49]" "(49,56]" "(56,63.1]" "(63.1,70.1]"
and would like to get:
[1] 6.94 14 21 28 35 42 49 56 63.1 70.1
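Or is there a better way to compute the upper limits of the categories?
Answer 0 (score: 4)
This could be one solution: strip everything up to and including the comma, then drop the trailing bracket.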
k <- sub("^.*\\,","", levels(bins))
as.numeric(substr(k,1,nchar(k)-1))
which gives
[1] 6.94 14.00 21.00 28.00 35.00 42.00 49.00 56.00 63.10 70.10
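The same extraction can also be sketched in a single step with a capture group. The snippet below is only an illustration: the vector `lev` is a hand-typed copy of the levels shown in the question (standing in for `levels(bins)`), and `upper` is a name chosen here.

```r
# Hand-typed copy of the levels from the question (stand-in for levels(bins)).
lev <- c("(-0.07,6.94]", "(6.94,14]", "(14,21]", "(21,28]", "(28,35]",
         "(35,42]", "(42,49]", "(49,56]", "(56,63.1]", "(63.1,70.1]")

# Keep only the text between the last comma and the closing bracket.
# [^]]+ matches one or more characters that are not "]".
upper <- as.numeric(sub(".*,([^]]+)\\]$", "\\1", lev))
print(upper)
```

Note that this assumes the default right-closed labels produced by cut(); labels created with right = FALSE end in ")" and would need a different pattern.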
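Answer 1 (score: 1)
If you want the exact break values, you should compute them yourself, because cut rounds the interval limits it prints in the labels (the number of digits is controlled by dig.lab):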
x <- seq(0,1,by=.023)
levels(cut(x, 4))
# [1] "(-0.000989,0.247]" "(0.247,0.494]" "(0.494,0.742]" "(0.742,0.99]"
levels(cut(x, 4, dig.lab=10))
# [1] "(-0.000989,0.2467555]" "(0.2467555,0.4945]" "(0.4945,0.7422445]"
# [4] "(0.7422445,0.989989]"
You can look at the code of cut.default to see how breaks is computed when a single number of intervals is given:
if (length(breaks) == 1L) {
    if (is.na(breaks) | breaks < 2L)
        stop("invalid number of intervals")
    nb <- as.integer(breaks + 1)
    dx <- diff(rx <- range(x, na.rm = TRUE))
    if (dx == 0)
        dx <- abs(rx[1L])
    breaks <- seq.int(rx[1L] - dx/1000, rx[2L] + dx/1000,
        length.out = nb)
}
So a simple approach is to take this code and put it into a function:
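compute_breaks <- function(x, breaks) {
    # Same logic as cut.default: expand a single interval count into break points.
    if (length(breaks) == 1L) {
        if (is.na(breaks) | breaks < 2L)
            stop("invalid number of intervals")
        nb <- as.integer(breaks + 1)
        dx <- diff(rx <- range(x, na.rm = TRUE))
        if (dx == 0)
            dx <- abs(rx[1L])
        # Widen the data range by 0.1% on each side, as cut() does.
        breaks <- seq.int(rx[1L] - dx/1000, rx[2L] + dx/1000,
                          length.out = nb)
    }
    # A vector of explicit breaks is returned unchanged.
    breaks
}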
The result is
compute_breaks(x,4)
# [1] -0.000989 0.246755 0.494500 0.742244 0.989989
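Another option, in the spirit of the question's "is there a better way", is to choose the breaks yourself and pass them to cut(), so the upper limits are known exactly and never need to be parsed from the labels. The following is a minimal sketch under assumptions: the data `values` are made up, 10 equal-width categories over 0-70 are assumed (the question leaves the number of categories open), and the names `brk`, `upper` and `upper_per_value` are chosen here for illustration.

```r
set.seed(1)                          # only to make the illustration reproducible
values <- runif(100, 0, 70)          # hypothetical data in the 0-70 range

brk  <- seq(0, 70, length.out = 11)  # 11 break points define 10 categories
bins <- cut(values, breaks = brk, include.lowest = TRUE)

upper <- brk[-1]                     # exact upper limit of each category
upper_per_value <- upper[as.integer(bins)]  # upper limit of the bin each value fell into

print(upper)
print(head(data.frame(values, bins, upper_per_value)))
```

Because the breaks are supplied explicitly, the labels may still be rounded for display, but `upper` holds the unrounded limits.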