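How can I use a batch file and the Windows command interpreter to extract the sections of a file that begin with HDR followed by a searched keyword?
Only certain HDR sections should be copied to another file named GoodHDR.txt.
The HDR sections not included in the search should be copied to a different file named BadHDR.txt.
For example, I have the HeaderList.txt below and need to get the HEADER0001 and HEADER0003 sections.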
HDRHEADER0001 X004010850P
BEG00SAD202659801032017021699CANE
HDRHEADER0002 X004010850P
BEG00SAD202611701012017021499CANW
DTM01020170214
N1ST 92 0642397236
N315829 RUE BELLERIVE
N4MONTREAL QCH1A5A6 CANADA
HDRHEADER0003 X004010850P
BEG00SAP521006901012017021399CANOUT B16885
DTM01020170213
N1STCEGEP SAINT LAURENT 92 0642385892
Expected result:
GoodHDR.txt contains only HEADER0001 and HEADER0003:
HDRHEADER0001 X004010850P
BEG00SAD202659801032017021699CANE
HDRHEADER0003 X004010850P
BEG00SAP521006901012017021399CANOUT B16885
DTM01020170213
N1STCEGEP SAINT LAURENT 92 0642385892
BadHDR.txt contains HEADER0002:
HDRHEADER0002 X004010850P
BEG00SAD202611701012017021499CANW
DTM01020170214
N1ST 92 0642397236
N315829 RUE BELLERIVE
N4MONTREAL
Answer 0 (score: 1)
The batch code below expects to be started with the arguments 0001 0003 to produce the two output files from the source file as posted.
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceFile=HeaderList.txt"
set "FoundFile=GoodHDR.txt"
set "IgnoreFile=BadHDR.txt"
if "%~1" == "" goto ShowHelp
if "%~1" == "/?" goto ShowHelp
if not exist "%SourceFile%" goto NoHeaderList
del "%IgnoreFile%" 2>nul
del "%FoundFile%" 2>nul
rem Assign the header numbers passed as arguments to environment variables
rem named HDR%~1X, HDR%~2X, HDR%~3X, etc. They are used later to quickly look
rem up the number of the current header in the list of specified numbers.
rem All argument strings not consisting of exactly 4 digits are ignored.
set HeadersCount=0
:SetHeaders
set "HeaderNumber=%~1"
if "%HeaderNumber:~3,1%" == "" goto NextArgument
if not "%HeaderNumber:~4,1%" == "" goto NextArgument
for /F "delims=0123456789" %%I in ("%HeaderNumber%") do goto NextArgument
set "HDR%HeaderNumber%X=%HeaderNumber%"
set /A HeadersCount+=1
:NextArgument
shift /1
if not "%~1" == "" goto SetHeaders
if %HeadersCount% == 0 goto ShowHelp
rem Process the header blocks in the source file.
set "OutputFile=%IgnoreFile%"
for /F "usebackq delims=" %%L in ("%SourceFile%") do call :ProcessLine "%%L"
rem Output a summary information of header block separation process.
if "%HeadersCount%" == "-1" set "HeadersCount="
if not defined HeadersCount (
    echo All header blocks found and written to file "%FoundFile%".
    goto EndBatch
)
set "SingularPlural= was"
if not %HeadersCount% == 1 set "SingularPlural=s were"
echo Following header block%SingularPlural% not found:
echo/
for /F "tokens=2 delims==" %%V in ('set HDR') do echo %%V
goto EndBatch
rem ProcessLine is a subroutine called from the main FOR loop with
rem a line read from the source file as first and only parameter.
rem It compares the beginning of the line with HDRHEADER. The line is
rem written to the active output file if it does not start with that string.
rem Otherwise the string after HDRHEADER is extracted from the
rem line and searched for in the list of HDR environment variables.
rem If the header is in the list of environment variables, this line and
rem all following lines up to the next header line or the end of the source
rem file are written to the file with found header blocks.
rem Otherwise the current header line and all following lines up to the
rem next header line or the end of the source file are written to the file
rem with header blocks to ignore.
rem Once all header blocks to find have been found and written completely
rem to the file for found header blocks, all remaining lines of the source
rem file are written to the ignore file without further evaluation.
:ProcessLine
if not defined HeadersCount (
    >>"%OutputFile%" echo %~1
    goto :EOF
)
set "Line=%~1"
if not "%Line:~0,9%" == "HDRHEADER" (
    >>"%OutputFile%" echo %~1
    goto :EOF
)
set "HeaderLine=%Line:~9%"
for /F %%N in ("%HeaderLine%") do set "HeaderNumber=%%N"
set "OutputFile=%IgnoreFile%"
for /F %%N in ('set HDR%HeaderNumber%X 2^>nul') do (
    set "HDR%HeaderNumber%X="
    set /A HeadersCount-=1
    set "OutputFile=%FoundFile%"
)
>>"%OutputFile%" echo %~1
if %HeadersCount% == 0 (
    set "HeadersCount=-1"
) else if %HeadersCount% == -1 (
    set "HeadersCount="
)
goto :EOF
:NoHeaderList
echo Error: The file "%SourceFile%" could not be found in directory:
echo/
echo %CD%
goto EndBatch
:ShowHelp
echo Searches for specified headers in "%SourceFile%" and writes the
echo found header blocks to file "%FoundFile%" and all other to file
echo "%IgnoreFile%" and outputs the header blocks not found in file.
echo/
echo %~n0 XXXX [YYYY] [ZZZZ] [...]
echo/
echo %~nx0 must be called with at least one header number.
echo Only numbers with 4 digits are accepted as parameters.
:EndBatch
echo/
endlocal
pause
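As a usage sketch, the commands below could be typed at a command prompt in the directory containing HeaderList.txt. The file name ExtractHDR.bat is a hypothetical name chosen only for illustration, since the answer does not name the batch file.

```bat
rem ExtractHDR.bat is a hypothetical file name for the batch code above.
rem Extract the header blocks 0001 and 0003 from HeaderList.txt.
ExtractHDR.bat 0001 0003

rem Display the two output files after the batch file finished.
type GoodHDR.txt
type BadHDR.txt
```

With the posted HeaderList.txt, the batch file should report that all header blocks were found. A number that does not occur in the source file, for example 0009, should be listed as not found instead.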
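The redirection operator >> and the current name of the output file are specified at the beginning of every line that prints the current line with command ECHO. This avoids appending a trailing space to each line written to the output file, and printing still works correctly if a line ends with 1, 2, 3, ...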
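The difference can be seen with a small demonstration. This is only a sketch: the file name RedirectDemo.txt and the sample line are assumptions made for illustration and are not part of the answer.

```bat
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem RedirectDemo.txt is a hypothetical file used only for this demonstration.
set "DemoFile=%TEMP%\RedirectDemo.txt"
del "%DemoFile%" 2>nul
set "Line=N1ST 92 1"

rem Redirection first: the line is written without a trailing space.
>>"%DemoFile%" echo %Line%

rem Redirection last with a space: the space before the operator is written too.
echo %Line% >>"%DemoFile%"

rem Redirection last without a space: the digit 1 at the end of the line is
rem interpreted as handle number of the redirection and is not written.
echo %Line%>>"%DemoFile%"

type "%DemoFile%"
endlocal
```

Only the first form writes the line exactly as read, which is why the batch code puts the redirection at the beginning of the ECHO command lines.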
Some additional notes about limitations on usage of this code:
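The batch code is written without delayed expansion so that lines containing an exclamation mark can be processed easily. The disadvantage of not using delayed expansion is that lines containing characters with a special meaning on the command line, like &, >, <, |, etc., result in wrong output and can even produce additional, unwanted files in the current directory.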
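It would of course be possible to extend the batch code to work for source files with lines containing any ANSI character, but this is not necessary here because the source file example does not contain any "poison" character.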
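One common way to make such an extension is to assign each line to a variable while delayed expansion is disabled and to enable delayed expansion only for the output. The following is a minimal sketch of that idea, not the answer's code: it only copies the lines, and HeaderCopy.txt is a hypothetical output file name.

```bat
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceFile=HeaderList.txt"
rem HeaderCopy.txt is a hypothetical output file used only in this sketch.
set "CopyFile=HeaderCopy.txt"
del "%CopyFile%" 2>nul
for /F "usebackq delims=" %%L in ("%SourceFile%") do (
    rem Assign the line while delayed expansion is disabled
    rem so that exclamation marks in the line are kept.
    set "Line=%%L"
    setlocal EnableDelayedExpansion
    rem The line is expanded after parsing and special characters stay literal.
    >>"!CopyFile!" echo(!Line!
    endlocal
)
endlocal
```

Variables changed between SETLOCAL and ENDLOCAL inside the loop are discarded on each iteration, so the header bookkeeping of the batch code would have to be done outside that inner block.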
FOR ignores empty lines when reading lines from a text file. So the code produces 1 or 2 output files without the empty lines of the source file.
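If empty lines had to be kept, one widely used workaround is to let FINDSTR prefix every line with its line number so that no line is empty for FOR. The sketch below assumes this approach and again only copies the file to the hypothetical HeaderCopy.txt; it is not part of the answer.

```bat
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceFile=HeaderList.txt"
rem HeaderCopy.txt is a hypothetical output file used only in this sketch.
set "CopyFile=HeaderCopy.txt"
del "%CopyFile%" 2>nul
rem FINDSTR /N outputs every line with line number and colon as prefix.
for /F "delims=" %%L in ('%SystemRoot%\System32\findstr.exe /N "^" "%SourceFile%"') do (
    set "Line=%%L"
    setlocal EnableDelayedExpansion
    rem Remove everything up to and including the first colon.
    >>"!CopyFile!" echo(!Line:*:=!
    endlocal
)
endlocal
```

Because every line now starts with a digit, the default end-of-line character of FOR does not cause any line to be skipped in this variant.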
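The main FOR loop reading the lines from the source file skips all lines starting with a semicolon. If this could be a problem, specify on the FOR command line reading the lines from the source file, before delims=, the option eol= with a character that definitely never exists at the beginning of a line in the source file. For details about the options eol=, delims= and tokens= of for /F, see the help of command FOR displayed on running for /? in a command prompt window.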
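A minimal sketch of such a FOR command line is shown below. It assumes that no line in the source file starts with a vertical bar; that character is an assumption chosen for illustration and must be replaced if it can occur at the beginning of a line.

```bat
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceFile=HeaderList.txt"
rem Assumption: no line in the source file starts with a vertical bar.
rem With eol set to that character, lines starting with a semicolon are read too.
for /F "usebackq eol=| delims=" %%L in ("%SourceFile%") do echo %%L
endlocal
```

In the batch code above, the same option string would be used on the FOR command line that calls the subroutine ProcessLine.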
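The length of a string assigned to an environment variable, plus the equal sign, plus the name of the environment variable, is limited to 8192 characters. The variable name Line has 4 characters, which together with the equal sign leaves 8187 characters for the value. Therefore this batch code cannot be used for source files with lines longer than 8187 characters.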
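The length of a command line is also limited. The maximum length depends on the version of Windows. So this batch file cannot be used with a very large number of header numbers.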
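To understand the commands used and how they work, open a command prompt window, execute the following commands, and read all help pages displayed for each command completely.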
call /?
del /?
echo /?
endlocal /?
for /?
goto /?
if /?
pause /?
rem /?
set /?
setlocal /?
shift /?
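Read also the Microsoft article about Using Command Redirection Operators for an explanation of >> and 2>nul, and also of 2^>nul, where the redirection operator > is escaped with the caret character ^ so that it is interpreted as a literal character when the FOR command line is parsed, but later as a redirection operator when command FOR executes command SET.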