Question

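A lady keeps sending me phone numbers, and every time they arrive in a messy format. So I want to copy her whole message from Skype and have a batch file parse the saved .txt file, searching only for 10 consecutive digits.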

For example, she sends me:

Hello more numbers for settings please,
WYK-0123456789 
CAMP-0123456789 
0123456789
Include 0123456789
This is an urgent number: 0123456789 
TIDO: 0123456789
Send to> 0123456789

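It is very messy; the only constant is the 10 digits. So I would like the .bat file to somehow scan this monstrosity and leave me with only the 10-digit numbers, one per line.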

I tried the following:

@echo off
setlocal enableDelayedExpansion
(
  for /f %%A in (
    'findstr "^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]" yourFile.txt'
  ) do (
    set "ln=%%A"
    echo !ln:~0,9!
  )
)>newFile.txt

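Unfortunately, this only works when a line begins with the 10 digits; it does not help when the 10-digit number is in the middle or at the end of a line.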

Answer 1

Unfortunately, solving this problem in a general way is quite difficult. The batch file below correctly gets the numbers from your sample file, but if your real data contains numbers in a different format, the program will fail... Of course, in that case you just need to include the new format in the program!

:)

For example, the program will fail if the file contains a 10-character "word" that starts with a digit but is not a telephone number...

Answer 2

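Provided that the 10-digit number is the first numeric portion in each line of the file (let us call it numbers.txt), that is, it comes before any other digits, you could use the following: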

@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem // Define constants here:
set "_FILE=.\numbers.txt"
set /A "_DIG=10"

rem // The first delimiter is TAB, the last one is SPACE:
for /F "usebackq tokens=1 delims=   ^!#$%%&'()*+,-./:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^^_`abcdefghijklmnopqrstuvwxyz{|}~ " %%L in ("!_FILE!") do (
    set "NUM=%%L#"
    if "!NUM:~%_DIG%!"=="#" echo(%%L
)

endlocal
exit /B

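This uses for /F and its delims option string, which includes most ASCII characters except the numerals. You could extend the delims option string to cover extended characters as well (those with a code greater than 0x7F); just make sure SPACE is the last character specified.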
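To see the delims mechanism in isolation, here is a minimal sketch. It assumes a sample line whose only non-digit characters are uppercase letters, a colon and a space; the LINE variable and its value are illustrative and not part of the script above. Every character listed in delims is treated as a separator, so the first token is the first run of digits:

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Hypothetical sample line, used only for this demonstration:
set "LINE=TIDO: 0123456789"

rem // Uppercase letters, `:` and SPACE act as delimiters; SPACE comes last:
for /F "tokens=1 delims=ABCDEFGHIJKLMNOPQRSTUVWXYZ: " %%L in ("%LINE%") do echo(%%L

endlocal
```

With this input the loop should echo 0123456789. A line containing a character that is missing from the list (a lowercase letter, for instance) would not split as intended, which is why the full script lists nearly every ASCII character.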

This approach can extract the 10-digit number from a line like this:

garbage text>0123456789_more text0123-end

But it fails if a line looks like this, that is, when the first numeric portion is not the 10-digit number:

garbage text: 0123 tel. 0123456789; end

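Here is a comprehensive solution based on the above approach. The list of characters for the delims option of for /F is created automatically here. This may take a few seconds, but it is done only once at the very beginning, so for large files you will probably not notice the overhead: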

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Define constants here:
set "_FILE=.\numbers.txt"
set /A "_DIG=10"

rem // Define global variables here:
set "$CHARS="

rem // Capture current code page and set Windows default one:
for /F "tokens=2 delims=:" %%P in ('chcp') do set /A "CP=%%P"
> nul chcp 437

rem /* Generate list of escaped characters other than numerals (escaped means every character
rem    is preceded by `^`); there are some characters excluded:
rem    - NUL (this cannot be stored in an environment variable and should not occur anyway),
rem    - CR + LF, (they build up line-breaks, so they cannot occur within a line obviously),
rem    - SPACE, (because this must be placed as the last character of the `delims`option),
rem    - `"`, (because this impairs the quotation within the following code portion),
rem    - `!` + `^` (they may lead to unexpected results when delayed expansion is enabled): */
setlocal EnableDelayedExpansion
for /L %%I in (0x01,1,0xFF) do (
    rem // Exclude codes of aforementioned characters:
    if %%I GEQ 0x30 if %%I LSS 0x3A (set "SKIP=#") else (set "SKIP=")
    if not defined SKIP if %%I NEQ 0x00 if %%I NEQ 0x0A if %%I NEQ 0x0D (
        if %%I NEQ 0x20 if %%I NEQ 0x21 if %%I NEQ 0x22 if %%I NEQ 0x5E (
            rem // Convert code to character and append to list separated by `^`:
            cmd /C exit %%I
            for /F delims^=^ eol^= %%J in ('
                forfiles /P "%~dp0." /M "%~nx0" /C "cmd /C echo 0x220x!=ExitCode:~-2!0x22"
            ') do (
                set "$CHARS=!$CHARS!^^%%~J"
            )
        )
    )
)
endlocal & set "$CHARS=%$CHARS%"

rem /* Apply escaped list of characters as delimiters and apply some of the characters
rem    excluded before, namely SPACE, `"`, `!` and `^`;
rem    read file using `type` in order to convert from Unicode, if applicable: */
for /F tokens^=1*^ eol^=^ ^ delims^=^!^"^^%$CHARS%^  %%K in ('type "%_FILE%"') do (
    set "NUM=%%K#" & set "REST=%%L"
    rem // Test whether extracted numeric string holds the given number of digits:
    setlocal EnableDelayedExpansion
    if "!NUM:~%_DIG%!"=="#" echo(%%K
    endlocal
    rem /* Current line holds more than a single numeric portion, so process them in a
    rem    sub-routine; this is not called if the line contains a single number only: */
    if defined REST call :SUB REST
)

rem // Restore previous code page:
> nul chcp %CP%

endlocal
exit /B


:SUB  ref_string
    setlocal DisableDelayedExpansion
    setlocal EnableDelayedExpansion
    set "STR=!%~1!"
    rem // Parse line string using the same approach as in the main routine:
    :LOOP
    if defined STR (
        for /F tokens^=1*^ eol^=^ ^ delims^=^^^!^"^^^^%$CHARS%^  %%E in ("!STR!") do (
            endlocal
            set "NUM=%%E#" & set "STR=%%F"
            setlocal EnableDelayedExpansion
            rem // Test whether extracted numeric string holds the given number of digits:
            if "!NUM:~%_DIG%!"=="#" echo(%%E
        )
        rem // Loop back if there are still more numeric parts encountered:
        goto :LOOP
    )
    endlocal
    endlocal
    exit /B

This approach detects 10-digit numbers anywhere in the file, even if a single line contains several of them.
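The delimiter list in the script above is built by turning each character code into the character itself. The following sketch isolates that step for a single code, assuming the hypothetical value 65 (0x41, the letter A). It relies on the dynamic variable =ExitCode, which holds the last exit code as eight hexadecimal digits, and on forfiles, which replaces 0xHH sequences in its /C command string with the corresponding character:

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem // 65 is an illustrative code, chosen here only for demonstration:
cmd /C exit 65

rem // `=ExitCode` now holds 00000041; its last two digits form the hex code:
echo Hex code: !=ExitCode:~-2!

rem // Let forfiles expand the 0xHH sequence into the character; it runs once
rem // because the mask matches only this script file itself:
for /F "delims=" %%J in ('
    forfiles /P "%~dp0." /M "%~nx0" /C "cmd /C echo 0x!=ExitCode:~-2!"
') do echo Character: %%J

endlocal
```

Unlike the full script, this simplified version does not wrap the character in quotation marks (0x22), so it is only suitable for harmless characters such as letters.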

Answer 3

@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q44134518.txt"
SET "outfile=%destdir%\outfile.txt"
ECHO %time%
(
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO SET "line=%%a"&CALL :process
)>"%outfile%"
ECHO %time%

GOTO :EOF

:lopchar
SET "line=%line:~1%"
:process
IF "%line:~9,1%"=="" GOTO :eof
SET "candidate=%line:~0,10%"
SET /a count=0
:testlp
SET "char=%candidate:~0,1%"
IF "%char%" gtr "9" GOTO lopchar
IF "%char%" lss "0" GOTO lopchar
SET /a count+=1
IF %count% lss 10 SET "candidate=%candidate:~1%"&GOTO testlp
ECHO %line:~0,10%
GOTO :eof

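You would need to change the settings of sourcedir and destdir to suit your circumstances. I used a file named q44134518.txt containing your data plus some extra test information.

Produces the file defined as %outfile%.

Read each line of data into %%a and then into line.

Process each line starting at :process. See whether the line is 10 or more characters long; if not, terminate the subroutine.

Since the line is 10 or more characters long, select the first 10 into candidate and clear count to 0.

Assign the first character to char and test whether it is greater than '9' or less than '0'. If either is true, lop off the first character of line and try again (until we have a digit or line has 9 or fewer characters).

Count each successive digit. If we have not yet counted 10, remove the first character from candidate and check again.

When we reach 10 successive digits, echo the first 10 characters of line, all of which are digits and hence the data required.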
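The length test used at :process can be tried on its own. This is a minimal sketch with two hypothetical sample values (they are not taken from the test file mentioned above); it shows that the character at offset 9 exists only when the string holds at least 10 characters:

```batch
@ECHO OFF
SETLOCAL

REM Hypothetical sample values, used only for this demonstration.
SET "line=CAMP-0123456789"
CALL :check
SET "line=12345"
CALL :check
GOTO :EOF

:check
REM The character at offset 9 exists only if line has 10 or more characters.
IF "%line:~9,1%"=="" (ECHO "%line%" is too short) ELSE (ECHO "%line%" has 10 or more characters)
GOTO :EOF
```

The first value should be reported as having 10 or more characters and the second as too short. The test assumes line is defined; for an undefined variable the substring expression is not expanded to an empty string.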

Answer 4

Just another option:

@echo off
    setlocal enableextensions disabledelayedexpansion

    rem Configure
    set "file=input.txt"

    rem Initialization
    set "counter=0" & set "number="

    rem Convert file to a character per line and add ending line
    (for /f "delims=" %%a in ('
        ^( cmd /q /u /c type "%file%" ^& echo( ^)^| find /v ""
    ') do (
        rem See if current character is a number
        (for /f "delims=0123456789" %%b in ("%%a") do (
            rem Not a number, see if we have retrieved 10 consecutive numbers 
            set /a "1/((counter+1)%%11)" || (
                rem We probably have 10 numbers, check and output data
                setlocal enabledelayedexpansion
                if !counter!==10 echo !number!
                endlocal
            )
            rem As current character is not a number, initialize
            set "counter=0" & set "number="
        )) || ( 
            rem Digit read, increase counter and concatenate
            set /a "counter+=1"
            setlocal enabledelayedexpansion
            for %%b in ("!number!") do endlocal & set "number=%%~b%%a"
        )
    )) 2>nul

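The basic idea is to start a cmd instance with Unicode output (/u), type the file from inside this instance, and filter the two-byte output with find, which expands each input line into one character per output line.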
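The splitting step can be observed separately. This is a small sketch that assumes the hypothetical sample string ab12 instead of a file:

```batch
@echo off
rem "ab12" is a hypothetical sample string used only for this demonstration.
rem /u makes the inner cmd write its output as two-byte characters and
rem /q turns echo off; find /v "" then emits each character on its own line.
cmd /q /u /c echo ab12| find /v ""
```

Run from a batch file, this would typically list a, b, 1 and 2 on separate lines, possibly followed by an empty line for the line terminator. There is deliberately no space before the pipe, since echo would otherwise output it as an extra character.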

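Once each character is on a separate line and this output is processed inside a for /f command, we only need to concatenate successive digits until a non-digit character is found. At that point we check whether a run of 10 digits has been collected and output the data if so.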
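The set /a line in the script acts as a cheap pre-filter: the division only fails when (counter+1) is a multiple of 11, so the costlier delayed-expansion check runs rarely. Here is a sketch of that behavior, assuming two hypothetical counter values:

```batch
@echo off
setlocal enableextensions disabledelayedexpansion

rem 3 and 10 are hypothetical counter values used only for this demonstration
for %%c in (3 10) do (
    set "counter=%%c"
    rem set /a reads counter at execution time, so no delayed expansion is needed
    set /a "1/((counter+1)%%11)" >nul 2>nul && (
        echo counter=%%c, division succeeded
    ) || (
        echo counter=%%c, division by zero, so counter may be 10
    )
)

endlocal
```

Because counter values such as 21 or 32 also trigger the division by zero, the script still compares counter with 10 before echoing the number.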

Search for 10 consecutive digits
