I have a long file of identifiers, for example:
A
A
B
C
A
C
I want to group, count, and sort them to produce this file:
A 3
C 2
B 1
How can I do this in a CMD script?
Answer 0 (score: 2)
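Global edit: all code has been modified to allow - in identifiers. Identifiers must not contain !.
Assuming the identifiers do not contain =, $, or !, and that identifiers are case-insensitive, the following script lists the counts sorted by identifier.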
@echo off
setlocal enableDelayedExpansion
:: Clear any existing $ variables
for /f "delims==" %%V in ('set $ 2^>nul') do set "%%V="
:: Get a count of each identifier
for /f "usebackq delims=" %%A in ("test.txt") do (
  set /a "cnt=!$%%A!+1"
  set "$%%A=!cnt!"
)
:: Write the results to a new file
>output.txt (
  for /f "tokens=1,2 delims=$=" %%A in ('set $') do echo %%A %%B
)
:: Show the result
type output.txt
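The $ prefix can be changed as needed. However, this technique cannot be used if the identifiers are case-sensitive, because environment variable names are case-insensitive.
Edit
Here is a version that sorts the results by count in descending order: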
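As a minimal sketch of the case-sensitivity limitation, the snippet below assigns to two variable names that differ only in case. The names $abc and $ABC are hypothetical and chosen only for illustration. Because CMD treats them as the same variable, identifiers such as abc and ABC would be merged into one count by the scripts in this answer.

```batch
@echo off
setlocal
:: Clear any existing $ variables so the listing below shows only ours
for /f "delims==" %%V in ('set $ 2^>nul') do set "%%V="
:: These two assignments write to the same variable, because
:: environment variable names are case-insensitive
set "$abc=1"
set "$ABC=2"
:: Lists a single variable whose value is 2
set $
```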
@echo off
setlocal enableDelayedExpansion
:: Clear any existing $ variables
for /f "delims==" %%V in ('set $ 2^>nul') do set "%%V="
:: Get a count of each identifier
for /f "usebackq delims=" %%A in ("test.txt") do (
  set /a "cnt=!$%%A!+1"
  set "$%%A=!cnt!"
)
:: Write a temp file with zero padded counts prefixed to the left.
>temp.txt (
  for /f "tokens=1,2 delims=$=" %%A in ('set $') do (
    set "cnt=000000000000%%B"
    echo !cnt:~-12!=%%A=%%B
  )
)
:: Sort and write the results to a new file
>output.txt (
  for /f "tokens=2,3 delims=$=" %%A in ('sort /r temp.txt') do echo %%A %%B
)
del "temp.txt"
:: Show the result
type output.txt
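The zero padding matters because sort compares lines as text, so an unpadded 10 would sort before 9. The following sketch isolates the padding step using the same 12-digit width as the script above. The sample values 9, 10, and 100 are made up for illustration.

```batch
@echo off
setlocal enableDelayedExpansion
:: Prefix each number with 12 zeros, then keep only the last 12 characters
for %%N in (9 10 100) do (
  set "cnt=000000000000%%N"
  echo !cnt:~-12!
)
:: Prints:
:: 000000000009
:: 000000000010
:: 000000000100
```

With every count padded to the same width, text order and numeric order agree, which is what lets sort /r produce a descending list.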
Edit 2
Here is another option that sorts by count descending, assuming REPL.BAT is somewhere in your PATH:
@echo off
setlocal enableDelayedExpansion
:: Clear any existing $ variables
for /f "delims==" %%V in ('set $ 2^>nul') do set "%%V="
:: Get a count of each identifier
for /f "usebackq delims=" %%A in ("test.txt") do (
  set /a "cnt=!$%%A!+1"
  set "$%%A=!cnt!"
)
:: Sort result by count descending and write to output file
set $|repl "\$(.*)=(.*)" "000000000000$2=$1 $2"|repl ".*(.{12}=.*)" $1|sort /r|repl ".{13}(.*)" $1 >output.txt
:: Show the result
type output.txt