I have several data frames that I bind into `final`, which contains two variables: "Label" and "Mean".
The labels are in this format:
> Label Mean
>1 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (10) 18.97021
>2 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (11) 16.40476
>3 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (12) 24.79132
>4 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (13) 20.95391
>5 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (14) 19.67626
>6 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (15) 28.93776
I would like to organize the data by the number in Label, like this:
> Label Mean
>1 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (1) 18.97021
>2 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (2) 16.40476
>3 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (3) 24.79132
>4 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (4) 20.95391
>5 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (5) 19.67626
>6 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (6) 28.93776
Any suggestions on how to accomplish this? Thanks.
Answer 0 (score: 3)
Use mixedorder from the gtools package:
df[gtools::mixedorder(df$Label),]
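To illustrate why a plain sort is not enough here, the sketch below uses hypothetical labels (not the asker's data) to show that character sorting places "(10)" before "(2)". The `gtools` call is guarded because the package may not be installed; the variable names `labels` and `lexi` are illustrative assumptions.

```r
# Hypothetical labels for illustration only (not the asker's data)
labels <- paste0("Sample (", c(2, 10, 1), ")")

# Plain character sorting compares character by character,
# so "(10)" lands before "(2)".
# method = "radix" sorts in the C locale, which keeps the result
# independent of the local collation settings.
lexi <- sort(labels, method = "radix")
print(lexi)

# mixedorder() treats embedded digit runs as numbers;
# guarded in case gtools is not installed.
if (requireNamespace("gtools", quietly = TRUE)) {
  print(labels[gtools::mixedorder(labels)])
}
```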
Answer 1 (score: 1)
Here is a solution that extracts the number inside "()" using strsplit:
Sample input data:
df<-data.frame(Label=c("C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (12)",
"C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (11)",
"C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (10)"),
Mean=c(1,2,3))
df
Label Mean
1 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (12) 1
2 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (11) 2
3 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (10) 3
Sorting:
df[order(as.numeric(unlist(strsplit(unlist(lapply(strsplit(as.character(df$Label),split="(",fixed=TRUE),"[",2)),split=")",fixed=TRUE)))),]
Label Mean
3 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (10) 3
2 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (11) 2
1 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (12) 1
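The nested strsplit calls can be hard to read. As a possible alternative, here is a minimal base R sketch that captures the digits with a single `sub()` call. It assumes every label ends in a parenthesized integer; the sample data and the names `label_num` and `sorted` are hypothetical.

```r
# Hypothetical sample data mirroring the label format above
df <- data.frame(
  Label = paste0("C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (", c(12, 2, 10), ")"),
  Mean = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Capture the digits inside the trailing "(...)" and convert them to numbers
label_num <- as.numeric(sub(".*\\((\\d+)\\)$", "\\1", df$Label))

# Reorder the rows by that number
sorted <- df[order(label_num), ]
print(sorted)
```

If a label does not match the pattern, `sub()` returns it unchanged and `as.numeric()` yields `NA` with a warning, so it may be worth checking for `NA` values before ordering.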
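Answer 2 (score: 1)
I would first create a new variable containing the digits that follow the first opening parenthesis, excluding the parenthesis itself. Then I order the data frame by that variable.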
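library(stringr)
# Extract the digits that follow "(" (lookbehind keeps the parenthesis out of the match)
df$label_id = as.numeric(str_extract(df$Label, '(?<=\\()\\d+'))
# Order the rows by the extracted number
df = df[order(df$label_id),]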
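Answer 3 (score: 0)
Here is a dplyr approach that arranges by the number in Label and then mutates Label so that it is renumbered from 1: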
library(magrittr)
ans <- df %>%
dplyr::arrange(as.numeric(gsub(".*\\((\\d+)\\)$", "\\1", Label))) %>%
dplyr::mutate(Label = paste0(gsub("(.*)\\(\\d+\\)$", "\\1", Label), "(", dplyr::row_number(), ")"))
# Label Mean
# 1 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (1) 18.97021
# 2 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (2) 16.40476
# 3 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (3) 24.79132
# 4 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (4) 20.95391
# 5 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (5) 19.67626
# 6 C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (6) 28.93776
Data
df <- read.table(text="Label,Mean
C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (10),18.97021
C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (11),16.40476
C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (12),24.79132
C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (13),20.95391
C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (14),19.67626
C2-Concatenated Stacks-1:c:2/3 - MDAMB231 (15),28.93776", header=TRUE, sep=",", stringsAsFactors=FALSE)