I have a column in a data.table whose values have the format string + integer, e.g.
string1, string2, string3, string4, string5,
When I use sort(), these strings come out in the wrong order:
string1, string10, string11, string12, string13, ..., string2, string20,
string21, string22, string23, ....
How can I sort them into this order instead?
string01, string02, string03, string04, string05, ..., string10, string11,
string12, etc.
One approach might be to add a leading 0 to every integer < 10 (i.e. 1-9)?
I suspect you would extract the string part with str_extract(dt$string_column, "[a-z]+") and then add a 0 to each single-digit integer, somehow with sprintf()...
Answer 0 (score: 6)
We can strip the non-digit characters, convert what remains to integer, and order the rows by that value:
dt[order(as.integer(gsub("\\D+", "", col1)))]
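The same idea also works on a plain character vector in base R. This is only a minimal sketch on made-up data (the vector `x` is an assumption for illustration, not from the question), and it assumes every element contains exactly one run of digits:

```r
# Hypothetical sample data in the "string + integer" format from the question
x <- c("string1", "string10", "string11", "string2", "string20")

# Drop every non-digit character and convert the remainder to integer
num <- as.integer(gsub("\\D+", "", x))

# Reorder the original strings by their numeric part
result <- x[order(num)]
print(result)
```

Because the original strings are only reordered and never modified, no zero-padding is needed with this approach.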
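Answer 1 (score: 1)
You can use str_extract from the stringr package to get the numbers and order by them. Elements without a number are sorted alphabetically and appended at the end: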
x = c("string1","string3","stringZ","string2","stringX","string10")
library(stringr)
c(x[grepl("\\d+",x)][order(as.integer(str_extract(x[grepl("\\d+",x)],"\\d+")))],
sort(x[!grepl("\\d+",x)]))
#[1] "string1" "string2" "string3" "string10" "stringX" "stringZ"
Answer 2 (score: 1)
You can use mixedsort from the gtools package:
vec <- c("string1", "string10", "string11", "string12", "string13","string2",
"string20", "string21", "string22", "string23")
library(gtools)
mixedsort(vec)
#[1] "string1" "string2" "string10" "string11" "string12" "string13"
# "string20" "string21" "string22" "string23"
Answer 3 (score: 1)
Assuming the strings look like this:
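library(data.table)
library(stringr)
xstring <- data.table(x = c("string1", "string11", "string2", "string10", "stringx"))
# Pull out the digits that follow "string" ("" when there are none)
extracts <- str_extract(xstring$x, "(?<=string)(\\d*)")
# Prefix single digits with "0"; leave two-digit and empty matches unchanged
y_string <- ifelse(nchar(extracts) == 2 | extracts == "", extracts, paste0("0", extracts))
# Put the padded digits back into the original strings
fin_string <- str_replace(xstring$x, "(?<=string)(\\d*)", y_string)
sort(fin_string)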
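The code above pads to two digits only. If the numbers can be longer, the sprintf() idea from the question could be sketched in base R roughly as follows. The sample vector `x` and the names `prefix`, `num`, `width` and `padded` are made up for illustration, and the sketch assumes every element ends in at least one digit (so a value like "stringx" would need separate handling):

```r
# Hypothetical sample data, including a three-digit number
x <- c("string1", "string11", "string2", "string10", "string100")

# Split each value into its non-digit prefix and its trailing number
prefix <- sub("\\d+$", "", x)
num    <- as.integer(sub("^\\D*", "", x))

# Pad every number with zeros to the width of the longest one
width  <- max(nchar(num))
padded <- paste0(prefix, sprintf(paste0("%0", width, "d"), num))

result <- sort(padded)
print(result)
```

Since all padded numbers have the same width, plain sort() now puts them in numeric order. If the original unpadded labels should be kept, x[order(padded)] returns those in the same order.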