I have a text file that consists of one long string, like this:
ISA*00*GARBAGE~ST*TEST*TEST~CLP*TEST~ST*TEST*TEST~CLP*TEST~ST*TEST*TEST~CLP*TEST~GE*GARBAGE*~
I need it to look like this:
~ST*TEST*TEST~CLP*TEST
~ST*TEST*TEST~CLP*TEST
~ST*TEST*TEST~CLP*TEST
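I first tried to split the string by adding a line break before each ~ST, but I can't for the life of me make that happen. I have tried a variety of scripts, and I thought a find/replace script would work best.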
@echo off
setlocal enabledelayedexpansion
set INTEXTFILE=test.txt
set OUTTEXTFILE=test_out.txt
set SEARCHTEXT=~ST
set REPLACETEXT=~ST
for /f "tokens=1,* delims=~" %%A in ( '"type %INTEXTFILE%"') do (
SET string=%%A
SET modified=!string:%SEARCHTEXT%=%REPLACETEXT%!
echo !modified! >> %OUTTEXTFILE%
)
del %INTEXTFILE%
rename %OUTTEXTFILE% %INTEXTFILE%
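I found it here: How to replace substrings in windows batch file
But I'm stuck, because (1) the special character ~ keeps the code from working at all. It gives me this result:
string:~ST=~ST
If I put quotes around "~ST", the code does nothing at all. (2) I can't figure out how to add a line break before ~ST.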
The final task is to delete the ISA*00*blahblahblah and ~GE*blahblahblah lines once all the splitting is done. But I'm still stuck on the ~ST splitting part.
Any suggestions?
Answer 0 (score: 3)
@echo off
setlocal EnableDelayedExpansion
rem Set next variable to the number of "~" chars that delimit the wanted fields, or more
set "maxTokens=7"
rem Define the delimiters that start a new field
set "delims=/ST/GE/"
for /F "delims=" %%a in (test.txt) do (
   set "line=%%a"
   set "field="
   rem Process up to maxTokens per line;
   rem this is a trick to avoid a call to a subroutine that has a goto loop
   for /L %%i in (0,1,%maxTokens%) do if defined line (
      for /F "tokens=1* delims=~" %%b in ("!line!") do (
         rem Get the first token in the line separated by "~" delimiter
         set "token=%%b"
         rem ... and update the rest of the line
         set "line=%%c"
         rem Get the first two chars after "~" token like "ST", "CL" or "GE";
         rem if they are "ST" or "GE":
         for %%d in ("!token:~0,2!") do if "!delims:/%%~d/=!" neq "%delims%" (
            rem Start a new field: show previous one, if any
            if defined field echo !field!
            if "%%~d" equ "ST" (
               set "field=~%%b"
            ) else (
               rem It is "GE": cancel rest of line
               set "line="
            )
         ) else (
            rem It is "CL" token: join it to current field, if any
            if defined field set "field=!field!~%%b"
         )
      )
   )
)
Input:
ISA*00*GARBAGE~ST*TEST1*TEST1~CLP*TEST1~ST*TEST2*TEST2~CLP*TEST2~ST*TEST3*TEST3~CLP*TEST3~GE*GARBAGE*~CLP~TESTX
Output:
~ST*TEST1*TEST1~CLP*TEST1
~ST*TEST2*TEST2~CLP*TEST2
~ST*TEST3*TEST3~CLP*TEST3
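For comparison outside of batch, the token walk used by the script above can be sketched in Python. This only illustrates the logic and is not part of the original answer; the function name split_records and its parameters are made up for this sketch, and it assumes the whole file fits in one string.

```python
def split_records(line, start="ST", stop="GE", sep="~"):
    """Walk the sep-delimited tokens, roughly mirroring the batch loop above."""
    records = []
    field = None  # record being built; None until the first start token is seen
    for token in line.split(sep):
        if not token:
            continue  # skip empty tokens, as for /F does with repeated delimiters
        tag = token[:2]  # first two characters, like !token:~0,2! in the script
        if tag == start:
            # A new record begins: emit the previous one, if any
            if field is not None:
                records.append(field)
            field = sep + token
        elif tag == stop:
            # End marker: emit the pending record and ignore the rest of the line
            if field is not None:
                records.append(field)
            break
        elif field is not None:
            # Any other token (e.g. CLP) is joined to the current record
            field += sep + token
    return records


if __name__ == "__main__":
    sample = ("ISA*00*GARBAGE~ST*TEST1*TEST1~CLP*TEST1~ST*TEST2*TEST2~CLP*TEST2"
              "~ST*TEST3*TEST3~CLP*TEST3~GE*GARBAGE*~CLP~TESTX")
    for record in split_records(sample):
        print(record)
```

As in the batch version, a trailing record is only emitted when a GE token closes it, and everything after GE is ignored.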
Answer 1 (score: 0)
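The ~ character cannot be used as the first character of the search string in the substring replacement syntax %VARIABLE:SEARCH_STRING=REPLACE_STRING%, because it is used to mark the substring expansion %VARIABLE:~POSITION,LENGTH% (type set /? for more information).
Assuming your text file contains only a single line of text and does not exceed about 8 KB in size, I see the following option to accomplish your task. The script uses the substring replacement syntax %VARIABLE:*SEARCH_STRING=REPLACE_STRING%; the * matches everything up to and including the first occurrence of SEARCH_STRING. The restrictions of this approach are listed after the script: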
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem initialise constants:
set "INFILE=test_in.txt"
set "OUTFILE=test_out.txt"
set "SEARCH=ST"
set "TAIL=GE"
rem read single-line file content into variable:
< "%INFILE%" set /P "DATA="
rem remove everything before first `~%SEARCH%`:
set "DATA=~%SEARCH%!DATA:*~%SEARCH%=!"
rem call sub-routine, redirect its output:
> "%OUTFILE%" call :LOOP
endlocal
goto :EOF
:LOOP
rem extract portion right to first `~%SEARCH%`:
set "RIGHT=!DATA:*~%SEARCH%=!"
rem skip rest if no match found:
if "!RIGHT!"=="!DATA!" goto :TAIL
rem extract portion left to first `~%SEARCH%`, including `~`:
set "LEFT=!DATA:%SEARCH%%RIGHT%=!"
rem the last character must be a `~`;
rem so remove it; `echo` outputs a trailing line-break;
rem the `if` avoids an empty line at the beginning;
rem the unwanted part at the beginning is removed implicitly:
if not "!LEFT:~,-1!"=="" echo(!LEFT:~,-1!
rem output `~%SEARCH%` without trailing line-break:
< nul set /P "DUMMY=~%SEARCH%"
rem store remainder for next iteration:
set "DATA=!RIGHT!"
rem loop back if remainder is not empty:
if not "!DATA!"=="" goto :LOOP
:TAIL
rem this section removes the part starting at `~%TAIL%`:
set "RIGHT=!DATA:*~%TAIL%=!"
if "!RIGHT!"=="!DATA!" goto :EOF
set "LEFT=!DATA:%TAIL%%RIGHT%=!"
rem output part before `~%TAIL%` without trailing line-break:
< nul set /P "DUMMY=!LEFT:~,-1!"
goto :EOF
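The idea behind the script, stripping everything up to the first marker and then cutting at each following marker, can also be sketched in Python with str.partition. This is only an illustration under the same assumptions as the script (one line of input, a single ~GE after all ~ST occurrences); the name split_on_marker and its parameters are hypothetical.

```python
def split_on_marker(data, marker="~ST", tail="~GE"):
    """Split data before every marker, dropping the leading and trailing parts."""
    # Like !DATA:*~ST=! : drop everything up to and including the first marker
    _, found, rest = data.partition(marker)
    if not found:
        return []  # no marker at all, so nothing to output
    # Like the :TAIL section: cut everything from the tail marker onward
    rest = rest.partition(tail)[0]
    # Each remaining chunk is the body of one record; put the marker back in front
    return [marker + chunk for chunk in rest.split(marker)]


if __name__ == "__main__":
    sample = ("ISA*00*GARBAGE~ST*TEST*TEST~CLP*TEST~ST*TEST*TEST~CLP*TEST"
              "~ST*TEST*TEST~CLP*TEST~GE*GARBAGE*~")
    print("\n".join(split_on_marker(sample)))
```

Unlike the batch variables, Python strings need no special handling for characters such as " or %, so this sketch does not share that limitation.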
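This approach is subject to the following restrictions: there is a single instance of ~GE, and it occurs after all instances of ~ST; there is always at least one character between two adjacent instances of ~ST; and the data contains no special characters such as " or %.
Answer 2 (score: 0)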
Don't reinvent the wheel; use a regular expression replacement tool such as sed or JREPL.BAT:
call jrepl "^.*?~ST(.+?)~GE.*$" "'~ST'+$1.replace(/~ST/g,'\r\n$&')" /jmatch <in.txt >out.txt
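The same regular expression idea can be sketched with Python's re module, for readers without JREPL.BAT at hand. This is an approximation of the command above, not a drop-in replacement: the function name split_with_regex is made up, it inserts "\n" rather than "\r\n", and returning the text unchanged when nothing matches is an arbitrary choice for this sketch.

```python
import re


def split_with_regex(text):
    """Keep the part between the first ~ST and the next ~GE, one record per line."""
    # Non-greedy capture, like (.+?) in the JREPL pattern
    match = re.search(r"~ST(.+?)~GE", text, flags=re.DOTALL)
    if match is None:
        return text  # nothing to split
    # Restore the leading ~ST and put a line break before every later ~ST
    return "~ST" + match.group(1).replace("~ST", "\n~ST")


if __name__ == "__main__":
    sample = ("ISA*00*GARBAGE~ST*TEST*TEST~CLP*TEST~ST*TEST*TEST~CLP*TEST"
              "~ST*TEST*TEST~CLP*TEST~GE*GARBAGE*~")
    print(split_with_regex(sample))
```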