Question

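I have used simple batch files in the past to find a string in a single txt file and to merge multiple txt files, but this one is a bit more complicated and I don't know where to start.

Here is a breakdown of what I am trying to do:

There is a folder with 300+ txt files.

Each txt file contains at least one, but possibly hundreds, of occurrences of the string "documentID:" followed by 6 characters.

I want a txt or csv file that lists the file name of each txt file and, for every occurrence of the string "documentID:" found in that file, the 6 characters that follow it.

Example: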

jsmith.txt:

<type>not needed</type>
<version>1.0</version>
not needed,not needed, not needed, documentID:NEED01, not needed
not needed,not needed, not needed, documentID:NEED02, not needed

jdoe.txt:

<type>not needed</type>
<version>1.0</version>
not needed,not needed, not needed, documentID:NEED03, not needed

Desired output:

new.txt

jsmith, NEED01, NEED02
jdoe, NEED03

Answer 1

@echo off
setlocal EnableDelayedExpansion

for %%A in (*.txt) do (
    set "out="
    for /f "usebackq tokens=*" %%B in (`findstr /rc:"documentID:[^^,]*" "%%A"`) do (
        set "str=%%B"
        set "val=!str:*documentID:=!"
        set "tail=!val:*,=!"
        call set "res=%%val:,!tail!=%%"
        set "out=!out!, !res!"
    )
    echo %%~nA!out!
)

endlocal


Rem  For the jsmith.txt and jdoe.txt files shown above, this will output
Rem
Rem  jdoe, NEED03
Rem  jsmith, NEED01, NEED02

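The first for loop iterates over all *.txt files in the current directory.

The second for loop iterates over the output of the findstr command.

The findstr command looks for lines containing the pattern documentID:*, (the documentID: prefix followed by a value). The word documentID is matched case-sensitively. A , symbol should follow the value.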
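To see what the inner loop actually receives, the search can be run on its own from a command prompt. This is only a sketch, and it assumes the jsmith.txt sample from the question is in the current directory. Note that findstr prints each matching line in full, not just the matched part, which is why the set commands are still needed afterwards.

```batch
rem Run interactively from the folder that contains jsmith.txt.
rem /r treats the search string as a regular expression.
rem /c: keeps the search string together as a single string.
findstr /r /c:"documentID:[^,]*" "jsmith.txt"
```

For the sample file this should print the two lines that contain documentID: and skip the <type> and <version> lines.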

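The set "val=!str:*documentID:=!" command removes the beginning of the found line, up to and including the word documentID:.

The set "tail=!val:*,=!" command captures everything that comes after the first comma following documentID:.

The call set "res=%%val:,!tail!=%%" command removes that comma and tail from val, leaving only the value that follows documentID:.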
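The three substitutions can be traced on a single line. The following is a minimal standalone sketch, not part of the answer's script: it hard-codes one sample line from jsmith.txt (an assumption made for illustration) and echoes each intermediate variable.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Sample line copied from jsmith.txt in the question.
set "str=not needed,not needed, not needed, documentID:NEED01, not needed"

rem Step 1: drop everything up to and including the documentID: prefix.
set "val=!str:*documentID:=!"
echo val=[!val!]

rem Step 2: keep only what follows the first comma in val.
set "tail=!val:*,=!"
echo tail=[!tail!]

rem Step 3: remove the comma plus tail from val, which leaves just the ID.
rem The call forces a second round of percent expansion after tail is filled in.
call set "res=%%val:,!tail!=%%"
echo res=[!res!]

endlocal
```

Saved as a .bat file and run, this should print val=[NEED01, not needed], then tail=[ not needed], then res=[NEED01]. The brackets are there only to make the leading space in tail visible.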

Answer 2

The following script does what you want, assuming that each needed string portion is on its own line:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Define constants here:
set "_LOCATION=%~dp0."    & rem // (path to the directory containing the input files)
set "_PATTERN=*.txt"      & rem // (pattern the input files need to match)
set "_PREFIX=documentID:" & rem // (string that precedes the needed string portion)
set "_SEPAR=, "           & rem // (field separator for both input and output files)

rem // Loop through all matching input files:
for %%F in ("%_LOCATION%\%_PATTERN%") do (
    rem // Initialise collection variable with the name of the currently iterated file:
    set "COLLECT=%%~nxF"
    rem // Search current file for predefined prefix and loop over all applicable lines:
    for /F delims^=^ eol^= %%L in ('findstr /C:"%_PREFIX%" "%%~F"') do (
        rem // Store currently processed line:
        set "ITEM=" & set "LINE=%%L"
        rem // Toggle delayed expansion to not lose any exclamation marks `!`:
        setlocal EnableDelayedExpansion
        rem /* Split off the prefix and everything in front of it, then split off the
        rem    next separator (regard first character only) and everything behind: */
        for /F "delims=%_SEPAR:~,1% eol=%_SEPAR:~,1%" %%K in ("!LINE:*%_PREFIX%=!") do (
            endlocal
            set "ITEM=%%K"
            setlocal EnableDelayedExpansion
        )
        rem /* Append extracted string portion to collection variable and transport the
        rem    result over the `endlocal` barrier using the `for /F` command: */
        for /F "delims= eol=:" %%K in ("!COLLECT!%_SEPAR%!ITEM!") do (
            endlocal
            set "COLLECT=%%K"
        )
    )
    rem // Return the collected line for the currently iterated file:
    setlocal EnableDelayedExpansion
    echo(!COLLECT!
    endlocal
)

endlocal
exit /B
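The script above keeps delayed expansion disabled while it assigns %%L to a variable, and enables it only to read the variable back. The reason can be shown with a small standalone sketch; the sample text and the variable names SAFE and LOSSY are made up for illustration and do not appear in the script.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion

for /F "delims=" %%L in ("wow! really!") do (
    rem Delayed expansion is off here, so exclamation marks survive the assignment.
    set "SAFE=%%L"
    setlocal EnableDelayedExpansion
    rem Delayed expansion is on here, so exclamation marks in the loop value are consumed.
    set "LOSSY=%%L"
    echo SAFE=[!SAFE!]
    echo LOSSY=[!LOSSY!]
    endlocal
)

endlocal
```

This should print SAFE=[wow! really!] and LOSSY=[wow]. In the second assignment the text between the two exclamation marks is treated as the name of an undefined variable and expands to nothing, which is the kind of loss the toggling is meant to avoid.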

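To store the result in a text file, use redirection; for example, if the script is saved as merge-files.bat and the resulting text file should be D:\result\new.csv, call the script like this: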

merge-files.bat > "D:\result\new.csv"

Batch file to return a file name and multiple strings
