This seems like a very basic question, but I can't find an "easy" way to do it.
I'd like to sort a character vector containing semantic version numbers using base R functionality:
vsns <- c("1", "10", "1.1", "1.10", "1.2", "1.1.1",
"1.1.10", "1.1.2", "1.1.1.1", "1.1.1.10", "1.1.1.2")
Once sorted, it should look like this:
# [1] "1" "1.1" "1.1.1" "1.1.1.1" "1.1.1.2" "1.1.1.10"
# [7] "1.1.2" "1.1.10" "1.2" "1.10" "10"
This doesn't give me what I want, because R simply sorts the strings alphabetically:
sort(vsns)
# [1] "1" "1.1" "1.1.1" "1.1.1.1" "1.1.1.10" "1.1.1.2" "1.1.10"
# [8] "1.1.2" "1.10" "1.2" "10"
vsns[order(vsns)]
# [1] "1" "1.1" "1.1.1" "1.1.1.1" "1.1.1.10" "1.1.1.2" "1.1.10"
# [8] "1.1.2" "1.10" "1.2" "10"
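The mismatch comes down to how strings are compared, one character at a time, versus how version components should be compared, as whole numbers. A minimal illustration (the variable names are only for this sketch):

```r
# As strings, "1.10" and "1.2" first differ at the third character,
# where "1" sorts before "2"
string_cmp <- "1.10" < "1.2"

# As numbers, the second components compare the other way round
numeric_cmp <- 10 < 2

string_cmp   # TRUE
numeric_cmp  # FALSE
```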
I tried normalizing it (somewhat along the lines of this post), but I can't come up with a matching/replacement scheme that fits the structure of semantic versions:
tmp <- gsub("\\.", "", vsns)
# [1] "1" "10" "11" "110" "12" "111" "1110" "112" "1111" "11110"
# [11] "1112"
tmp_nchar <- sapply(tmp, nchar)
to_add <- max(tmp_nchar) - tmp_nchar
tmp <- sapply(1:length(tmp), function(ii) {
  paste0(tmp[ii], paste(rep("A", to_add[ii]), collapse = ""))
})
# [1] "1AAAA" "10AAA" "11AAA" "110AA" "12AAA" "111AA" "1110A" "112AA" "1111A" "11110"
# [11] "1112A"
vsns[order(tmp)]
# [1] "10" "1.10" "1.1.10" "1.1.1.10" "1.1.1.1" "1.1.1.2" "1.1.1"
# [8] "1.1.2" "1.1" "1.2" "1"
The best I could come up with so far is this, but it seems pretty... involved ;-)
sortVersionNumbers <- function(x, decreasing = FALSE) {
  # Split each version string into its components
  tmp <- strsplit(x, split = "\\.")
  tmp_l <- sapply(tmp, length)
  idx_max <- which.max(tmp_l)[1]
  tmp_l_max <- tmp_l[idx_max]
  # Pad shorter versions with NA so that all have the same length
  tmp_n <- lapply(tmp, function(ii) {
    ii_l <- length(ii)
    if (ii_l < tmp_l_max) {
      c(ii, rep(NA, (tmp_l_max - ii_l)))
    } else {
      ii
    }
  })
  # One row per version, one column per component
  tmp <- matrix(as.numeric(unlist(tmp_n)), nrow = length(tmp_n), byrow = TRUE)
  tmp_cols <- ncol(tmp)
  # Build and evaluate order(tmp[,1], tmp[,2], ..., na.last = FALSE)
  expr <- paste0("order(", paste(paste0("tmp[,", 1:tmp_cols, "]"),
                                 collapse = ", "), ", na.last = FALSE",
                 ifelse(decreasing, ", decreasing = TRUE)", ")"))
  idx <- eval(parse(text = expr))
  tmp_2 <- tmp[idx,]
  # Reassemble the version strings, dropping the NA padding
  sapply(1:nrow(tmp_2), function(ii) {
    paste(na.omit(tmp_2[ii,]), collapse = ".")
  })
}
sortVersionNumbers(vsns)
# [1] "1" "1.1" "1.1.1" "1.1.1.1" "1.1.1.2" "1.1.1.10" "1.1.2"
# [8] "1.1.10" "1.2" "1.10" "10"
sortVersionNumbers(sort(vsns))
# [1] "1" "1.1" "1.1.1" "1.1.1.1" "1.1.1.2" "1.1.1.10" "1.1.2"
# [8] "1.1.10" "1.2" "1.10" "10"
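As a side note, the eval(parse(...)) step is not strictly needed: the per-component keys can be handed to order() through do.call(). The following is only a sketch of that idea under the same NA-padding approach, not a tested replacement; the names `parts`, `keys` and `sorted` are made up for illustration.

```r
vsns <- c("1", "10", "1.1", "1.10", "1.2", "1.1.1",
          "1.1.10", "1.1.2", "1.1.1.1", "1.1.1.10", "1.1.1.2")

parts <- strsplit(vsns, ".", fixed = TRUE)
n_max <- max(lengths(parts))

# One numeric key per component position; versions that are too short
# get NA at that position
keys <- lapply(seq_len(n_max), function(j) {
  as.numeric(vapply(parts, function(p) p[j], character(1)))
})

# Equivalent to order(key1, key2, ..., na.last = FALSE);
# putting NA first makes "1.1" sort before "1.1.1"
idx <- do.call(order, c(keys, list(na.last = FALSE)))
sorted <- vsns[idx]
sorted
```

Indexing the original strings with `idx` also avoids having to paste the components back together.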
Answer 0 (score: 6)
From ?numeric_version:
> sort(numeric_version(vsns))
[1] '1' '1.1' '1.1.1' '1.1.1.1' '1.1.1.2' '1.1.1.10'
[7] '1.1.2' '1.1.10' '1.2' '1.10' '10'
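Note that sort() here returns a numeric_version object rather than a character vector. If plain strings are wanted back, one option is to use the parsed versions only as the sort key. A minimal sketch using base R only (the names `sorted` and `sorted_desc` are illustrative):

```r
vsns <- c("1", "10", "1.1", "1.10", "1.2", "1.1.1",
          "1.1.10", "1.1.2", "1.1.1.1", "1.1.1.10", "1.1.1.2")

# Parse once, then index the original strings by the resulting order
sorted <- vsns[order(numeric_version(vsns))]
sorted

# The same key can be used for descending order
sorted_desc <- vsns[order(numeric_version(vsns), decreasing = TRUE)]
sorted_desc
```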
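It's fairly interesting to see how this is implemented. numeric_version splits each version string into its integer parts and stores the vector of versions as a list of integer vectors. The xtfrm method (used by sort()) converts the integer vector for each version string into a single numeric value, using something along the lines of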
base <- max(unlist(x), 0, na.rm = TRUE) + 1
x <- vapply(x, function(t) sum(t/base^seq.int(0, length.out = length(t))),
1)
The result is a numeric vector that can be used to order the original vector in the standard way. So an ad hoc solution is
xtfrm.my_version <- function(x) {
  x <- lapply(strsplit(x, ".", fixed=TRUE), as.integer)
  base <- max(unlist(x), 0, na.rm = TRUE) + 1
  vapply(x, function(t) sum(t/base^seq.int(0, length.out = length(t))), 1)
}
vsns <- c("1", "10", "1.1", "1.10", "1.2", "1.1.1",
"1.1.10", "1.1.2", "1.1.1.1", "1.1.1.10", "1.1.1.2")
class(vsns) = "my_version"
sort(vsns)
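To see what the numeric keys look like, the same formula can be applied by hand to a few versions. This is only an illustrative sketch; the helper name `encode_versions` is made up here and is not part of base R.

```r
# Hypothetical helper: the encoding formula applied to a character vector
encode_versions <- function(v) {
  parts <- lapply(strsplit(v, ".", fixed = TRUE), as.integer)
  # base is one more than the largest component, so every component
  # behaves like a single digit in a base-`base` fraction
  base <- max(unlist(parts), 0, na.rm = TRUE) + 1
  vapply(parts,
         function(p) sum(p / base^seq.int(0, length.out = length(p))),
         numeric(1))
}

v <- c("1.10", "1.2", "1.1.1")
keys <- encode_versions(v)   # base is 11 for this input
# "1.10"  -> 1 + 10/11
# "1.2"   -> 1 + 2/11
# "1.1.1" -> 1 + 1/11 + 1/121
ordered <- v[order(keys)]
ordered
```

Because the keys are ordinary doubles, versions with very many or very large components could in principle run into floating-point precision limits, so this is best treated as a sketch for typical version strings.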
Answer 1 (score: 2)
Does this work?
vsns <- c("1", "10", "1.1", "1.10", "1.2", "1.1.1",
"1.1.10", "1.1.2", "1.1.1.1", "1.1.1.10", "1.1.1.2")
x <- strsplit(vsns, "\\.")
max.length <- max(sapply(x, function(i) max(nchar(i))))
y <- lapply(x, function(i) sprintf(as.numeric(i), fmt = paste0("%0", max.length, "d")))
y <- sapply(y, paste, collapse = ".")
vsns[order(y)]
# [1] "1" "1.1" "1.1.1" "1.1.1.1" "1.1.1.2" "1.1.1.10"
# [7] "1.1.2" "1.1.10" "1.2" "1.10" "10"
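To see why zero-padding helps, it can be useful to look at the intermediate keys: once every component has the same width, plain string ordering agrees with numeric ordering. A small sketch on a shorter vector, using formatC() instead of sprintf() for the padding (variable names are illustrative):

```r
v <- c("1.10", "1.2", "10", "1")
parts <- strsplit(v, ".", fixed = TRUE)

# Width of the widest single component, here 2
width <- max(nchar(unlist(parts)))

# Zero-pad every component to that width and glue the pieces back together
keys <- vapply(parts, function(p) {
  paste(formatC(as.integer(p), width = width, flag = "0"), collapse = ".")
}, character(1))
keys

ordered <- v[order(keys)]
ordered
```

Here `keys` is "01.10", "01.02", "10", "01", so ordering the keys as strings puts "1" first and "10" last.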
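Answer 2 (score: 1)
Try:
# Assumes every version has exactly three parts (columns X1, X2, X3),
# as in the four-element example that produced the output below
vsns <- c("0.1.1", "0.2.1", "0.10.1", "0.20.1")
ll = strsplit(vsns,'\\.')
dd = data.frame(t(sapply(ll, c)))
dd = data.frame(apply(dd, 2, function(x) as.numeric(as.character(x))))
dd = with(dd, dd[order(X1,X2,X3),])
ans = apply(dd, 1, paste, collapse=".")
ans
       1        2        3        4
 "0.1.1"  "0.2.1" "0.10.1" "0.20.1"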
Answer 3 (score: 1)
Trying with the new vsns data:
vsns <- c("1", "10", "1.1", "1.10", "1.2", "1.1.1", "1.1.10", "1.1.2", "1.1.1.1", "1.1.1.10", "1.1.1.2")
dd = data.frame(vsns)
library(splitstackshape)
dd2 = concat.split.expanded(dd, 'vsns', '.', fill = 0, drop = TRUE)
dd3 = cbind(dd, dd2)
dd4= with(dd3, dd3[order(vsns_1, vsns_2, vsns_3, vsns_4),])
dd4[is.na(dd4)]=0
dd4
       vsns vsns_1 vsns_2 vsns_3 vsns_4
9   1.1.1.1      1      1      1      1
11  1.1.1.2      1      1      1      2
10 1.1.1.10      1      1      1     10
6     1.1.1      1      1      1      0
8     1.1.2      1      1      2      0
7    1.1.10      1      1     10      0
3       1.1      1      1      0      0
5       1.2      1      2      0      0
4      1.10      1     10      0      0
1         1      1      0      0      0
2        10     10      0      0      0
apply(dd4[,2:5], 1, paste, collapse='.')
          9          11          10           6           8           7
 " 1.1.1.1"  " 1.1.1.2" " 1.1.1.10"  " 1.1.1.0"  " 1.1.2.0" " 1.1.10.0"
          3           5           4           1           2
 " 1.1.0.0"  " 1.2.0.0" " 1.10.0.0"  " 1.0.0.0"  "10.0.0.0"
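Answer 4 (score: 0)
Here is a solution that generalizes to version numbers with differing numbers of blocks (the indented sapply + ifelse lines) and that can handle mixed digits and letters (the mixedorder line).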
library(gtools)
vsns <- c("0.1.1", "0.10", "0.2.1", "0.2.1a", "0.20", "0.20.1.3")
v <- strsplit(vsns, "\\.")
tmp <- data.frame(sapply(1:max(sapply(v, length)), function(i){
    vv <- sapply(v, "[", i)
    ifelse(is.na(vv), "0", vv)
}), stringsAsFactors=FALSE)
vsns[do.call(mixedorder, tmp)]
[1] "0.1.1" "0.2.1" "0.2.1a" "0.10" "0.20" "0.20.1.3"