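This seems simple, but I can't find a simpler way to do it. I have the vector x below and need to create group names for runs of consecutive equal values. My attempt uses rle; are there any better ideas?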
# data
x <- c(1,1,1,2,2,2,3,2,2,1,1)
# make groups
rep(paste0("Group_", 1:length(rle(x)$lengths)), rle(x)$lengths)
# [1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4"
# [9] "Group_4" "Group_5" "Group_5"
Answer 0 (score: 11)
Use rleid from data.table:
library(data.table)
paste0('Group_', rleid(x))
#[1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4" "Group_5" "Group_5"
Answer 1 (score: 9)
Use diff and cumsum:
paste0("Group_", cumsum(c(1, diff(x) != 0)))
#[1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4" "Group_5" "Group_5"
(If your values are floating-point, you may have to avoid != and compare against a tolerance instead.)
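A minimal sketch of the tolerance variant mentioned above. The helper name run_groups and the default tol = 1e-8 are assumptions made for illustration; pick a tolerance that suits your data.

```r
# Hypothetical helper: label runs of approximately equal numeric values.
# `tol` is an assumed tolerance, not a value taken from the answer.
run_groups <- function(x, tol = 1e-8) {
  # TRUE wherever a value differs from its predecessor by more than tol
  changed <- abs(diff(x)) > tol
  # The first element always opens group 1; every change opens a new group
  paste0("Group_", cumsum(c(TRUE, changed)))
}

# 0.1 + 0.2 is not exactly equal to 0.3 in floating point
y <- c(0.1 + 0.2, 0.3, 0.3, 1.5, 1.5)

run_groups(y)
# [1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2"

# The exact comparison splits the first run in two
paste0("Group_", cumsum(c(1, diff(y) != 0)))
# [1] "Group_1" "Group_2" "Group_2" "Group_3" "Group_3"
```

Note that this compares each value only with its immediate predecessor, so a slow drift in which every step is below tol stays in one group.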
Answer 2 (score: 3)
Use cumsum, but without relying on the data being numeric.
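One way to use cumsum without relying on numeric data is to compare each element with its predecessor instead of calling diff. The sketch below is an illustration of that idea, not the code from the original answer; the helper name run_groups_any is hypothetical, and the input is assumed to contain no NA values.

```r
# Hypothetical helper: label runs of equal values for any comparable
# vector type (character, factor, logical, numeric). Assumes no NAs.
run_groups_any <- function(x) {
  n <- length(x)
  # Compare elements 2..n with elements 1..(n-1)
  changed <- x[-1] != x[-n]
  # The first element always opens group 1; every change opens a new group
  paste0("Group_", cumsum(c(TRUE, changed)))
}

z <- c("a", "a", "b", "b", "a")
run_groups_any(z)
# [1] "Group_1" "Group_1" "Group_2" "Group_2" "Group_3"
```

Applied to the numeric x from the question, it gives the same labels as the diff-based version.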
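Answer 3 (score: 2)
group() from groupdata2 can create groups from a list of group start points using the l_starts method. When n is set to "auto", it finds the group starts automatically: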
x <- c(1,1,1,2,2,2,3,2,2,1,1)
groupdata2::group(x, n = "auto", method = "l_starts")
## # A tibble: 11 x 2
## # Groups: .groups [5]
## data .groups
## <dbl> <fct>
## 1 1 1
## 2 1 1
## 3 1 1
## 4 2 2
## 5 2 2
## 6 2 2
## 7 3 3
## 8 2 4
## 9 2 4
## 10 1 5
## 11 1 5
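There is also a differs_from_previous() function, which finds the values, or the indices of the values, that differ from the previous value by some threshold.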
library(groupdata2)
# The values to start groups at
differs_from_previous(x, threshold = 1,
                      direction = "both")
## [1] 2 3 2 1
# The indices to start groups at
differs_from_previous(x, threshold = 1,
                      direction = "both",
                      return_index = TRUE)
## [1] 4 7 8 10
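For comparison, the same start values and indices can be reproduced in base R. This is only a sketch of the idea, not how groupdata2 implements it, and it assumes that differing "by a threshold" means an absolute difference of at least that threshold.

```r
x <- c(1,1,1,2,2,2,3,2,2,1,1)
threshold <- 1  # assumed meaning: minimum absolute difference

# Positions whose value differs from the previous one by at least `threshold`
start_idx <- which(abs(diff(x)) >= threshold) + 1

start_idx
# [1]  4  7  8 10

x[start_idx]
# [1] 2 3 2 1
```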