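I have a lady who keeps sending me phone numbers, and she sends them in a messy way every time. So I want to copy her whole message from Skype and have a batch file parse the saved .txt file, searching only for runs of 10 consecutive digits.
For example, she sends me: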
Hello more numbers for settings please,
WYK-0123456789
CAMP-0123456789
0123456789
Include 0123456789
This is an urgent number: 0123456789
TIDO: 0123456789
Send to> 0123456789
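It is a mess; the only constant is the 10 digits. So I would like the .bat file to somehow scan this monster and leave me with something like the following.
For example, what I want: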
0123456789
0123456789
0123456789
0123456789
0123456789
0123456789
0123456789
I tried the following:
@echo off
setlocal enableDelayedExpansion
(
for /f %%A in (
'findstr "^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]" yourFile.txt'
) do (
set "ln=%%A"
echo !ln:~0,10!
)
)>newFile.txt
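Unfortunately, this only works when the 10 digits are at the very beginning of a line; it does not help when they are in the middle or at the end of a line.
Answer 0 (score: 2)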
Unfortunately, it is very difficult to solve this problem in a general way. The batch file below correctly gets the numbers from your example file, but if your real data include a number in a different format, the program will fail... Of course, in that case you just need to add the new format to the program! :)
For example, this program will fail if a "word" of 10 characters that is not a phone number starts with a digit...
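Answer 1 (score: 2)
Given that the 10-digit number is the first numeric part in every line of the file (let us call it numbers.txt), ahead of any other digits, you could use the following: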
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem // Define constants here:
set "_FILE=.\numbers.txt"
set /A "_DIG=10"
rem // The first delimiter is TAB, the last one is SPACE:
for /F "usebackq tokens=1 delims= ^!#$%%&'()*+,-./:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^^_`abcdefghijklmnopqrstuvwxyz{|}~ " %%L in ("!_FILE!") do (
set "NUM=%%L#"
if "!NUM:~%_DIG%!"=="#" echo(%%L
)
endlocal
exit /B
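This uses for /F with its delims option string, which includes most ASCII characters other than digits. You can extend the delims option string to cover extended characters as well (those with a code greater than 0x7F); just make sure SPACE is the last character specified.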
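To illustrate the idea on a single line, here is a minimal sketch. The sample string and the shortened delims list (only the non-digit characters that occur in the sample) are assumptions made for this illustration; they are not part of the script above.

```bat
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical sample line, not taken from the original data.
set "LINE=CAMP-0123456789 end"
rem Every non-digit character of the sample is listed as a delimiter; SPACE comes last.
for /F "tokens=1 delims=ACMP-den " %%T in ("%LINE%") do (
    rem Append a sentinel and drop the first 10 characters:
    rem only the sentinel # is left when the token is exactly 10 characters long.
    set "NUM=%%T#"
    if "!NUM:~10!"=="#" echo %%T
)
endlocal
```

Because the delimiters are case-sensitive, both the upper-case and the lower-case letters have to be listed, which is why the full script spells out the whole alphabet twice.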
This method can extract the 10-digit number from a line like this:
garbage text>0123456789_more text0123-end
But it fails if a line looks like this, that is, when the first numeric part is not the 10-digit number:
garbage text: 0123 tel. 0123456789; end
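Here is a comprehensive solution based on the method above. The list of characters for the delims option of for /F is created automatically here. This may take a few seconds, but it is done only once at the very beginning, so for large files you will probably not notice the overhead: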
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_FILE=.\numbers.txt"
set /A "_DIG=10"
rem // Define global variables here:
set "$CHARS="
rem // Capture current code page and set Windows default one:
for /F "tokens=2 delims=:" %%P in ('chcp') do set /A "CP=%%P"
> nul chcp 437
rem /* Generate list of escaped characters other than numerals (escaped means every character
rem is preceded by `^`); there are some characters excluded:
rem - NUL (this cannot be stored in an environment variable and should not occur anyway),
rem - CR + LF, (they build up line-breaks, so they cannot occur within a line obviously),
rem - SPACE, (because this must be placed as the last character of the `delims`option),
rem - `"`, (because this impairs the quotation within the following code portion),
rem - `!` + `^` (they may lead to unexpected results when delayed expansion is enabled): */
setlocal EnableDelayedExpansion
for /L %%I in (0x01,1,0xFF) do (
rem // Exclude codes of aforementioned characters:
if %%I GEQ 0x30 if %%I LSS 0x3A (set "SKIP=#") else (set "SKIP=")
if not defined SKIP if %%I NEQ 0x00 if %%I NEQ 0x0A if %%I NEQ 0x0D (
if %%I NEQ 0x20 if %%I NEQ 0x21 if %%I NEQ 0x22 if %%I NEQ 0x5E (
rem // Convert code to character and append to list separated by `^`:
cmd /C exit %%I
for /F delims^=^ eol^= %%J in ('
forfiles /P "%~dp0." /M "%~nx0" /C "cmd /C echo 0x220x!=ExitCode:~-2!0x22"
') do (
set "$CHARS=!$CHARS!^^%%~J"
)
)
)
)
endlocal & set "$CHARS=%$CHARS%"
rem /* Apply escaped list of characters as delimiters and apply some of the characters
rem excluded before, namely SPACE, `"`, `!` and `^`;
rem read file using `type` in order to convert from Unicode, if applicable: */
for /F tokens^=1*^ eol^=^ ^ delims^=^!^"^^%$CHARS%^ %%K in ('type "%_FILE%"') do (
set "NUM=%%K#" & set "REST=%%L"
rem // Test whether extracted numeric string holds the given number of digits:
setlocal EnableDelayedExpansion
if "!NUM:~%_DIG%!"=="#" echo(%%K
endlocal
rem /* Current line holds more than a single numeric portion, so process them in a
rem sub-routine; this is not called if the line contains a single number only: */
if defined REST call :SUB REST
)
rem // Restore previous code page:
> nul chcp %CP%
endlocal
exit /B
:SUB ref_string
setlocal DisableDelayedExpansion
setlocal EnableDelayedExpansion
set "STR=!%~1!"
rem // Parse line string using the same approach as in the main routine:
:LOOP
if defined STR (
for /F tokens^=1*^ eol^=^ ^ delims^=^^^!^"^^^^%$CHARS%^ %%E in ("!STR!") do (
endlocal
set "NUM=%%E#" & set "STR=%%F"
setlocal EnableDelayedExpansion
rem // Test whether extracted numeric string holds the given number of digits:
if "!NUM:~%_DIG%!"=="#" echo(%%E
)
rem // Loop back if there are still more numeric parts encountered:
goto :LOOP
)
endlocal
endlocal
exit /B
This approach detects 10-digit numbers anywhere in the file, even if a line contains more than one.
Answer 2 (score: 2)
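The character list in the script above is built by turning a numeric code into a character. The following sketch isolates that step; the code 65 is just a hypothetical sample value, and the forfiles call mirrors the one used in the script.

```bat
@echo off
setlocal EnableDelayedExpansion
rem 65 is a hypothetical sample code, the letter A.
cmd /C exit 65
rem The dynamic variable =ExitCode holds the last exit code as 8 hexadecimal digits.
echo !=ExitCode!
rem forfiles replaces 0xHH in its command string by the corresponding character;
rem the batch file itself serves as the single file to iterate over.
forfiles /P "%~dp0." /M "%~nx0" /C "cmd /C echo 0x!=ExitCode:~-2!"
endlocal
```

Starting one cmd and one forfiles process per character is what makes the list generation take a few seconds.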
@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q44134518.txt"
SET "outfile=%destdir%\outfile.txt"
ECHO %time%
(
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO SET "line=%%a"&CALL :process
)>"%outfile%"
ECHO %time%
GOTO :EOF
:lopchar
SET "line=%line:~1%"
:process
IF "%line:~9,1%"=="" GOTO :eof
SET "candidate=%line:~0,10%"
SET /a count=0
:testlp
SET "char=%candidate:~0,1%"
IF "%char%" gtr "9" GOTO lopchar
IF "%char%" lss "0" GOTO lopchar
SET /a count+=1
IF %count% lss 10 SET "candidate=%candidate:~1%"&GOTO testlp
ECHO %line:~0,10%
GOTO :eof
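You will need to change the settings of sourcedir and destdir to suit your circumstances.
I used a file named q44134518.txt containing your data, plus some extra test data, for my testing.
This produces the file defined as %outfile%.
Each line of data is read into %%a and from there assigned to line.
Each line is processed starting at :process. See whether the line is 10 or more characters long; if not, terminate the subroutine.
Since the line is 10 or more characters long, select the first 10 into candidate and clear count to 0.
Assign the first character to char, and test whether it is greater than '9' or less than '0'. If either is true, lop off the first character of line and try again (until we have digits or line has 9 or fewer characters).
Count each successive digit. If we have not yet counted 10, lop off the first character of candidate and check again.
When we reach 10 successive digits, echo the first 10 characters of line, all of which are digits and the data required.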
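The steps above rest on three substring expressions. A small sketch with a hypothetical sample value shows what each one yields:

```bat
@echo off
setlocal
rem Hypothetical sample value, not read from a file.
set "line=TIDO: 0123456789"
rem Character at offset 9; this is empty when line has fewer than 10 characters.
echo [%line:~9,1%]
rem First 10 characters: the candidate window. Here: TIDO: 0123
echo [%line:~0,10%]
rem Everything after the first character: the window moves one position to the right.
echo [%line:~1%]
endlocal
```

The routine therefore slides a 10-character window along the line one character at a time, which is simple but not fast for long lines.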
Answer 3 (score: 1)
Just another option:
@echo off
setlocal enableextensions disabledelayedexpansion
rem Configure
set "file=input.txt"
rem Initialization
set "counter=0" & set "number="
rem Convert file to a character per line and add ending line
(for /f "delims=" %%a in ('
^( cmd /q /u /c type "%file%" ^& echo( ^)^| find /v ""
') do (
rem See if current character is a number
(for /f "delims=0123456789" %%b in ("%%a") do (
rem Not a number, see if we have retrieved 10 consecutive numbers
set /a "1/((counter+1)%%11)" || (
rem We probably have 10 numbers, check and output data
setlocal enabledelayedexpansion
if !counter!==10 echo !number!
endlocal
)
rem As current character is not a number, initialize
set "counter=0" & set "number="
)) || (
rem Number read, increase counter and concatenate
set /a "counter+=1"
setlocal enabledelayedexpansion
for %%b in ("!number!") do endlocal & set "number=%%~b%%a"
)
)) 2>nul
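The basic idea is to start a cmd instance with unicode output, type the file from inside this instance and filter the two-byte output with find, which expands each input line into one character per output line.
Once we have each character on a separate line, and this output is being processed inside a for /f command, we only need to concatenate successive digits until a non-numeric character is found. At that point we check whether a group of 10 digits has been collected and output the data if so.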
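To see the splitting step on its own, the pipeline can be run outside the for /f loop. This is only a sketch; input.txt is the same placeholder file name as in the script, and the file is assumed to contain plain ASCII text.

```bat
@echo off
setlocal
rem Placeholder input file name, as in the script above.
set "file=input.txt"
if not exist "%file%" (
    echo File not found: %file%
    exit /B 1
)
rem type inside a unicode-output cmd instance emits two bytes per character;
rem find /v "" then breaks the stream at the zero bytes, so each character
rem ends up on its own output line.
cmd /q /u /c type "%file%" | find /v ""
endlocal
```

Inside the script the same pipeline has to be escaped with ^ because it runs in the command part of for /f.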