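I have a huge text file in which each line contains a string with the pattern FEATURE_. I want to read each line of this txt file and delete from the file all other lines that contain the same FEATURE_ string.
Please suggest a DOS and a Perl command to do this.
For example:
Input: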
#ifdef FEATURE_ABCD
#ifdef FEATURE_GHDI
#ifdef FEATURE_ABCD
#ifdef FEATURE_WXYZ
#ifdef FEATURE_ABCD
#ifdef FEATURE_WXYZ
#ifdef FEATURE_GHDI
#ifdef FEATUREGHDI
#define FEATURE_ABCD
#define FEATUREGHDI
/* FEATURE_GHDI */
Output:
#ifdef FEATURE_ABCD
#ifdef FEATURE_GHDI
#ifdef FEATURE_WXYZ
#ifdef FEATUREGHDI
Answer 0 (score: 2)
Assuming your text file is FEATURE.TXT, try the following:
@ECHO OFF & setlocal enabledelayedexpansion
for /f "delims=" %%i in (FEATURE.TXT) do (
    set "line0=%%i"
    set "line=!line0:*FEATURE=!"
    if not "!line0!"=="!line!" (
        for /f %%j in ("!line!") do set "line=%%j"
        if not defined $a!line! (
            set "$a!line!=!line!"
            (echo(!line0!)
        )
    )
)
You can redirect the output to a file by putting >>OUTPUT.TXT after the (echo(!line0!) command.
The output is:
#ifdef FEATURE_ABCD
#ifdef FEATURE_GHDI
#ifdef FEATURE_WXYZ
#ifdef FEATUREGHDI
Edit: some improvements to speed up the code.
Answer 1 (score: 1)
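For readers who can run Python instead, here is a minimal sketch of the same idea as the batch file above: keep only the first line for each token that follows FEATURE, and drop lines that do not contain FEATURE at all. The function name `first_per_feature` and its `marker` parameter are made up for this illustration. Unlike the batch string substitution, the match here is case-sensitive.

```python
import re


def first_per_feature(lines, marker="FEATURE"):
    """Return the first line seen for each token that follows `marker`.

    Lines without `marker` are dropped, as in the batch file above.
    The name and signature are hypothetical, chosen for this sketch.
    """
    # Capture everything after the marker up to the next whitespace.
    pattern = re.compile(re.escape(marker) + r"(\S*)")
    seen = set()
    kept = []
    for line in lines:
        match = pattern.search(line)
        if match is None:
            continue  # no marker on this line: skip it
        key = match.group(1)
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept
```

Applied to the eleven input lines from the question, this returns the four lines listed as the expected output. For a really huge file it may be preferable to pass the open file object itself rather than a list, so lines are read one at a time.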
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
FOR /f "delims==" %%i IN ('set found 2^>nul') DO SET "%%i="
SET found=FEATURE_
SET /a count=0
(
  FOR /f "delims=" %%i IN ('findstr /n "$" ^<feature.txt') DO (
    SET feature=%%i
    SET line=!feature:*:=!
    IF DEFINED line (
      SET feature=!line:*FEATURE_=!
      IF "!line!"=="!feature!" (ECHO(!line!) ELSE (
        FOR /f %%f IN ("!feature!") DO SET feature=%%f&SET found|FINDSTR /e "=%%f" >NUL
        IF ERRORLEVEL 1 (
          ECHO(!line!
          SET found!count!=!feature!
          SET /a count+=1
        )
      )
    ) ELSE (ECHO()
  )
) >newfile.txt
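Every line, including empty lines, is copied to newfile.txt, unless it contains a FEATURE_ string that has already been recorded in one of the found<counter> variables.
BUT
Regarding Aacin's comment, perhaps you should sit down with a nice hot cup of tea and think about what you really want.
If this does exactly what you say you want, then the sequence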
#ifdef FEATURE_ABCD
something
endif
or
#ifdef FEATURE_ABCD something
may produce something you really do not want, as may
#ifdef FEATURE_ABCD
...
#define FEATURE_ABCD
...
#ifdef FEATURE_ABCD
...
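The caveat above can be made concrete with a small Python sketch of what this answer's batch file does: lines without FEATURE_ (including empty lines) pass through unchanged, and a FEATURE_ line is kept only the first time its name appears. The function name `dedupe_keep_other_lines` is hypothetical, and the match is case-sensitive here, which is an assumption of the sketch rather than a property of the batch code.

```python
def dedupe_keep_other_lines(lines, marker="FEATURE_"):
    """Copy every line, except repeats of an already-seen FEATURE_ name.

    Hypothetical helper for illustration; it mirrors the batch logic only
    approximately (case-sensitive, whitespace-delimited names).
    """
    seen = set()
    out = []
    for line in lines:
        _, found, rest = line.partition(marker)
        if not found:
            out.append(line)  # no marker: always preserved
            continue
        parts = rest.split()
        key = parts[0] if parts else ""  # name = text up to next whitespace
        if key not in seen:
            seen.add(key)
            out.append(line)
    return out


sample = [
    "#ifdef FEATURE_ABCD",
    "something",
    "#define FEATURE_ABCD",
    "",
    "#ifdef FEATURE_ABCD",
    "#endif",
]
for kept_line in dedupe_keep_other_lines(sample):
    print(kept_line)
```

In this sample both the `#define FEATURE_ABCD` line and the second `#ifdef FEATURE_ABCD` are removed while `something` and `#endif` survive, so the result is no longer balanced preprocessor code. That is exactly the kind of outcome worth thinking about before running this on real sources.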
Answer 2 (score: 0)
Minimal code and functionality:
@echo OFF
Set "File=Input.txt"
Set "OutputFile=Output.txt"
For /F "Usebackq Tokens=2,* delims= " %%# in ("%File%") Do (
    Echo "%%#" | Find /I "Feature_" 1>NUL && (
        (Type "%OutputFile%" 2>NUL | FIND /I "%%#" 1>NUL) || (Echo %%#>>"%OutputFile%")))
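The code skips lines that do not contain the "Feature_" string. When a valid string is found, it checks whether that string already exists in the output file and adds or skips it accordingly.
Tested with the input text; received the correct output: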
#ifdef FEATURE_ABCD
#ifdef FEATURE_GHDI
#ifdef FEATURE_WXYZ
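The batch file above looks up each candidate by re-reading the output file, which gets slower as the output grows. As a rough sketch of an alternative, the Python function below streams the input file to the output file and remembers the names it has already written in an in-memory set. The function name `filter_file` is invented for this example, the file names are taken from the batch variables, and the match is case-insensitive to mirror Find /I. Note one difference: the sketch writes the whole line rather than only the matched token.

```python
import re


def filter_file(src_path, dst_path, marker="FEATURE_"):
    """Stream src_path to dst_path, keeping the first line per FEATURE_ name.

    Hypothetical helper: lines without the marker are skipped, and names
    are compared case-insensitively.
    """
    pattern = re.compile(re.escape(marker) + r"\S*", re.IGNORECASE)
    seen = set()
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:  # one line at a time, so huge files are fine
            match = pattern.search(line)
            if match is None:
                continue
            key = match.group(0).lower()
            if key not in seen:
                seen.add(key)
                dst.write(line)


if __name__ == "__main__":
    # File names as used in the batch file above.
    filter_file("Input.txt", "Output.txt")
```

Memory use grows only with the number of distinct feature names, not with the size of the file.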
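Answer 3 (score: 0)
There are several different ways to solve this problem, each with its own characteristics. The fastest solution executes the fewest commands per line of the input file and, in particular, avoids external commands. The batch file below is designed to quickly process a huge text file with many matching lines. The method first creates an auxiliary file containing the numbers of the lines to delete (using the FINDSTR command), then performs a file-merge process using this file and the original file.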
@echo off
setlocal EnableDelayedExpansion
set string=FEATURE_
rem Run FINDSTR to find the lines with the target string and store the numbers of the lines that will be deleted
(for /F "tokens=1* delims=:" %%a in ('findstr /N "%string%" inputFile.txt') do (
    set "line=%%b"
    for /F %%c in ("!line:*%string%=!") do (
        rem If this is the first line with the target string
        if not defined string[%%c] (
            rem Define the target string (and preserve this line)
            set string[%%c]=0
        ) else (
            rem Mark this line for deletion
            echo %%a
        )
    )
)) > linesToDelete.txt
rem Insert the EndOfFile mark
echo 0 >> linesToDelete.txt
rem Merge numbers of lines to delete (from STDIN) and input file (from FOR command)
< linesToDelete.txt (
    set /P lineToDelete=
    for /F "tokens=1* delims=:" %%a in ('findstr /N "^" inputFile.txt') do (
        if %%a neq !lineToDelete! (
            rem Preserve this line
            echo(%%b
        ) else (
            rem Ignore this line and pass to next one to delete
            set /P lineToDelete=
        )
    )
) > outputFile.txt
del linesToDelete.txt
This batch program will fail if the input file contains special batch characters such as ! < | > &. This limitation can be fixed if needed.
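The two-pass structure of this batch file may be easier to follow in a short Python sketch. The first function collects the numbers of the lines to delete and appends a 0 as the end mark. The second walks the input once and consumes those numbers in order. Both function names are hypothetical, and the sketch works on in-memory lists rather than on the linesToDelete.txt helper file.

```python
def lines_to_delete(lines, marker="FEATURE_"):
    """Pass 1: 1-based numbers of repeated FEATURE_ lines, ending with 0."""
    seen = set()
    doomed = []
    for number, line in enumerate(lines, start=1):
        _, found, rest = line.partition(marker)
        parts = rest.split()
        if not found or not parts:
            continue  # no usable name after the marker: keep the line
        key = parts[0]
        if key in seen:
            doomed.append(number)  # repeat: mark for deletion
        else:
            seen.add(key)  # first occurrence: preserve
    doomed.append(0)  # end mark; no real line is numbered 0
    return doomed


def merge(lines, doomed):
    """Pass 2: copy lines whose number is not the next one to delete."""
    pending = iter(doomed)
    next_doomed = next(pending)
    kept = []
    for number, line in enumerate(lines, start=1):
        if number != next_doomed:
            kept.append(line)
        else:
            # Drop this line and advance to the next number to delete.
            next_doomed = next(pending)
    return kept
```

Because the numbers are produced in ascending order, the second pass only ever compares against one pending number, which is what keeps the per-line work small. As in the batch file, lines that do not contain FEATURE_ (for example `#define FEATUREGHDI` in the question's input) are preserved, so the result can contain more lines than the output the question asks for.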