cut() - include the lowest value

Date: 2012-09-03 09:25:48

Tags: r

I want to cut my data using breaks that I pass to cut():

x = c(-10:10)

cut(x, c(-2,4,6,7))

[1] <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   (-2,4] (-2,4] (-2,4] (-2,4] (-2,4] (-2,4] (4,6]  (4,6]  (6,7]  <NA>   <NA>  
[21] <NA>  
Levels: (-2,4] (4,6] (6,7]

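However, I would also like to get the levels (minimum:-2] and (7:maximum]. In the recode() function of the car package, "lo:" can be used for this. Is there something similar available for cut()?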

6 answers:

Answer 0 (score: 8)

x <- -10:10

cut(x, c(-Inf, -2, 4, 6, 7, +Inf))

# Levels: (-Inf,-2] (-2,4] (4,6] (6,7] (7, Inf]
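
As a quick check (a minimal sketch added for illustration, not part of the original answer; the variable name `groups` is made up here), tabulating the result shows that every value of x falls into some interval once the breaks are open-ended:

```r
x <- -10:10

# -Inf and Inf catch everything at or below -2 and everything above 7
groups <- cut(x, breaks = c(-Inf, -2, 4, 6, 7, Inf))

anyNA(groups)   # no value is left unclassified
table(groups)   # counts per interval
```

The five counts should sum to length(x), since no observation is dropped.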

Answer 1 (score: 5)

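You can use min() and max() to compute the outer limits of the intervals (as Gavin noted) and set include.lowest = TRUE to make sure the minimum value (here -10) is included in the first interval.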

Input:

x = c(-10:10)

cut(x, c(min(x),-2,4,6,7,max(x)), include.lowest = TRUE)

Output:

 [1] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] (-2,4]  
[11] (-2,4]   (-2,4]   (-2,4]   (-2,4]   (-2,4]   (4,6]    (4,6]    (6,7]    (7,10]   (7,10]  
[21] (7,10]  
Levels: [-10,-2] (-2,4] (4,6] (6,7] (7,10]
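
To see why include.lowest = TRUE matters here, the following sketch (added for illustration; the names `breaks`, `without` and `with_lowest` are hypothetical) compares both calls on the same breaks. By default the intervals are right-closed, so the first one is (-10,-2] and the minimum itself is left out:

```r
x <- -10:10
breaks <- c(min(x), -2, 4, 6, 7, max(x))

# Default: the first interval is (-10,-2], so x == -10 falls outside it
without <- cut(x, breaks)

# include.lowest = TRUE closes the first interval on the left: [-10,-2]
with_lowest <- cut(x, breaks, include.lowest = TRUE)

which(is.na(without))    # position of the value that is lost by default
anyNA(with_lowest)       # check that nothing is lost with include.lowest
```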

Answer 2 (score: 4)

findInterval is the answer.

i <- findInterval(x, c(-2,4,6,7))

cbind(x, i)

        x i
 [1,] -10 0
 [2,]  -9 0
 [3,]  -8 0
 [4,]  -7 0
 [5,]  -6 0
 [6,]  -5 0
 [7,]  -4 0
 [8,]  -3 0
 [9,]  -2 1
[10,]  -1 1
[11,]   0 1
[12,]   1 1
[13,]   2 1
[14,]   3 1
[15,]   4 2
[16,]   5 2
[17,]   6 3
[18,]   7 4
[19,]   8 4
[20,]   9 4
[21,]  10 4
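
Note that in the output above -2, 4, 6 and 7 are each assigned to the interval that starts at them, i.e. findInterval() uses left-closed intervals by default, whereas cut() uses right-closed ones. As a sketch (added for illustration, assuming an R version in which findInterval() has the left.open argument; the names `breaks`, `i` and `g` are made up), the two can be made to agree:

```r
x <- -10:10
breaks <- c(-2, 4, 6, 7)

# left.open = TRUE switches to intervals of the form (a, b], as in cut()
i <- findInterval(x, breaks, left.open = TRUE)

# Reference: cut() with open-ended breaks, converted to integer codes
g <- cut(x, c(-Inf, breaks, Inf))

# findInterval() starts counting at 0, factor codes start at 1
all(i + 1L == as.integer(g))
```

Unlike cut(), findInterval() returns plain integer indices rather than a labelled factor, which may or may not be what is wanted.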

Answer 3 (score: 2)

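I have run into problems with Inf & -Inf before (although exactly why escapes me at the moment), so a safer solution might be to pad the breaks with a suitably extended minimum and maximum: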

x <- c(-10:10)
cut(x, c(min(x) -1 , -2, 4, 6, 7, max(x) + 1))

R> x <- c(-10:10)
R> cut(x, c(min(x) -1 , -2, 4, 6, 7, max(x) + 1))
 [1] (-11,-2] (-11,-2] (-11,-2] (-11,-2] (-11,-2] (-11,-2] (-11,-2] (-11,-2]
 [9] (-11,-2] (-2,4]   (-2,4]   (-2,4]   (-2,4]   (-2,4]   (-2,4]   (4,6]   
[17] (4,6]    (6,7]    (7,11]   (7,11]   (7,11]  
Levels: (-11,-2] (-2,4] (4,6] (6,7] (7,11]

In most cases, Sven's answer/solution will be sufficient.

Answer 4 (score: 0)

We can also use smart_cut from the cutr package:

# devtools::install_github("moodymudskipper/cutr")
library(cutr)

x <- -10:10
smart_cut(x, c(-2, 4, 6, 7), closed="right")

#  [1] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] [-10,-2] (-2,4]   (-2,4]   (-2,4]   (-2,4]   (-2,4]  
# [15] (-2,4]   (4,6]    (4,6]    7        (7,10]   (7,10]   (7,10]  
# Levels: [-10,-2] < (-2,4] < (4,6] < 7 < (7,10]

Its expand argument is set to TRUE by default; setting it to FALSE makes it behave like base::cut.

more on cutr and smart_cut

Answer 5 (score: 0)

(My) santoku package automatically extends the breaks when necessary:

library(santoku)
x <- c(-10:10)

chop(x, c(-2, 4, 6, 7))

##  [1] [-10, -2) [-10, -2) [-10, -2) [-10, -2) [-10, -2) [-10, -2) [-10, -2) [-10, -2) [-2, 4)  
## [10] [-2, 4)   [-2, 4)   [-2, 4)   [-2, 4)   [-2, 4)   [4, 6)    [4, 6)    [6, 7)    [7, 10]  
## [19] [7, 10]   [7, 10]   [7, 10]  
## Levels: [-10, -2) [-2, 4) [4, 6) [6, 7) [7, 10]

You can control this behavior with the extend argument of chop().