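The idea is to get the URLs that returned a 404 error together with the IDs listed above them (to show which IDs each URL belongs to), and also to find the FileName line and add it to the output file.
I have been trying to loop findstr so that it pulls lines by the line numbers found in a previous pass. Can someone help?
Sample file: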
FileName: LastABC-1563220.xml
-------------------------------
123456786
12348
1234DEF
-------------------------------
http://Product.com/1234DEF
HTTP/1.1 404 Not Found - 0.062000
http://Product.com/1234DEF_1
HTTP/1.1 200 OK - 0.031000
123456785
12349
1234EFG
-------------------------------
http://Product.com/1234EFG
HTTP/1.1 200 OK - 0.031000
123456784
12340
1234FGH
-------------------------------
http://Product.com/1234FGH
HTTP/1.1 200 OK - 0.031000
http://Product.com/1234FGH_1
HTTP/1.1 404 Not Found - 0.079000
http://Product.com/1234FGH_2
HTTP/1.1 404 Not Found - 0.067000
http://Product.com/1234FGH_4
HTTP/1.1 404 Not Found - 0.047000
Desired output:
FileName: LastABC-1563220.xml
123456786 12348 1234DEF
http://Product.com/1234DEF
123456784 12340 1234FGH
http://Product.com/1234FGH_1
http://Product.com/1234FGH_2
http://Product.com/1234FGH_4
The script I have so far:
del "%FailingURLS%" 2>nul
set numbers=
for /F "delims=:" %%a in ('findstr /I /N /C:"404 Not Found" %Formatedfile%') do (
set /A before=%%a-1
set "numbers=!numbers!!before!: "
)
(for /F "tokens=1* delims=:" %%a in ('findstr /N "^" %Formatedfile% ^| findstr /B "%numbers%"') do echo %%b) > %FailingURLS%
Answer 0 (score: 1)
This is the way I would do it:
@echo off
setlocal EnableDelayedExpansion
del PreviousLines.txt 2>nul
set "ids="
(for /F "delims=" %%a in (test.txt) do (
set "line=%%a"
if "!line:~0,9!" equ "FileName:" (
echo(!line!>> PreviousLines.txt
) else if "!line:~0,5!" equ "http:" (
if defined ids echo(!ids!>> PreviousLines.txt
set "ids="
echo(!line!>> PreviousLines.txt
) else if "!line:~0,4!" equ "HTTP" (
rem It is an "OK" or "Not Found" line...
rem If is "Not Found", show previous lines
if "!line:Not Found=!" neq "!line!" type PreviousLines.txt
rem Anyway, reset previous lines
del PreviousLines.txt 2>nul
set "ids="
) else if "!line:~0,5!" neq "-----" (
set "ids=!ids!!line! "
)
)) > FailingURLS.txt
Output:
FileName: LastABC-1563220.xml
123456786 12348 1234DEF
http://Product.com/1234DEF
http://Product.com/1234FGH_1
http://Product.com/1234FGH_2
http://Product.com/1234FGH_4
I don't understand why you show the IDs 123456784 12340 1234FGH before http://Product.com/1234FGH_1, because those IDs belong to http://Product.com/1234FGH, which returned OK...
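Answer 1 (score: 0)
Your question as it stands is too broad, so the following example only shows one way to retrieve the "404" URLs from the file, which I take to be your main issue.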
@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
Set "Src=formattedfile.txt"
Set "Str=404 Not Found"
(Set LF=^
% 0x0A %
)
For /F %%A In ('Copy /Z "%~f0" Nul')Do Set "CR=%%A"
SetLocal EnableDelayedExpansion
FindStr /RC:".*!CR!*!LF!.*%Str%" "%Src%"
EndLocal
Pause
Just modify the value on line 3 to match the name of your formatted text file.
Output for the file content you provided:
http://Product.com/1234DEF
http://Product.com/1234FGH_1
http://Product.com/1234FGH_2
http://Product.com/1234FGH_4
Press any key to continue . . .
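The FindStr pattern above appears to rely on matching a line together with the line break and the following line that contains the search string, while only the first line of each match is printed. As a cross-check of that "line before the match" idea outside of batch, here is a minimal Python sketch. The function name lines_before_match and its parameters are made up for illustration and are not part of the answer.

```python
def lines_before_match(lines, needle="404 Not Found"):
    """Return every line that is immediately followed by a line containing `needle`.

    Hypothetical helper for illustration only; the search is case-sensitive.
    """
    result = []
    previous = None  # the line read just before the current one
    for raw in lines:
        line = raw.rstrip("\r\n")
        # When the current line holds the search string, keep its predecessor.
        if needle in line and previous is not None:
            result.append(previous)
        previous = line
    return result
```

Called with the lines of the sample file, this should return the same four URLs as the batch script; like the batch version, it knows nothing about IDs or the FileName line.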
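Answer 2 (score: 0)
Here is a script (let's call it extract-failed-urls.bat) that demonstrates one possible way to accomplish the task, with some explanatory rem remarks to help you understand what happens: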
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_FILE=%~1" & rem // (`%~1` represents the first command line argument)
set "_URLP=://" & rem // (partial string that every listed URL contains)
set "_RESP=HTTP/1.1" & rem // (partial string that every response begins with)
set "_ERRN=404" & rem // (specific error number in response to recognise)
rem // Determine the total number of lines contained in the given file:
(for /F %%C in ('^< "%_FILE%" find /C /V ""') do set "CNT=%%C") || goto :EOF
rem // Read from the given file:
< "%_FILE%" (
rem // Clear IDs and URL buffer, and preset flag:
set "IDS=" & set "URL=" & set "FLAG=#"
setlocal EnableDelayedExpansion
rem // Read and write first line of file separately:
set /A "CNT-=1" & set "LINE=" & set /P LINE="" & < nul set /P ="!LINE!"
rem // Loop through the remaining lines:
for /L %%I in (1,1,!CNT!) do (
rem // Read a line and process only non-empty ones:
set /P LINE="" && (
rem // Try to split off response prefix:
set "REST=!LINE:*%_RESP% =!"
rem // Determine kind of current line:
if "!LINE:-=!" == "" (
rem // Line contains only hyphens `-`, so clear URL buffer:
set "URL="
) else if not "!LINE!" == "!LINE:*%_URLP%=!" (
rem // Line contains an URL, so store to URL buffer, set flag:
set "URL=!LINE!" & set "FLAG=#"
) else if "!LINE!" == "%_RESP% !REST!" (
rem // Line contains a response, so gather number:
for /F %%R in ("!REST!") do (
rem /* Specific error encountered, hence write IDs, if any,
rem clear IDs buffer, then write stored URL, if any: */
if "%%R" == "%_ERRN%" (
if defined IDS echo/& echo(!IDS!
set "IDS=" & if defined URL echo(!URL!
)
)
rem // Clear URL buffer and set flag:
set "URL=" & set "FLAG=#"
) else (
rem /* No other condition fulfilled, hence line contains an ID,
rem so put ID into IDs buffer, clear URL buffer and flag: */
if defined FLAG (set "IDS=!LINE!") else set "IDS=!IDS! !LINE!"
set "URL=" & set "FLAG="
)
)
)
endlocal
)
endlocal
exit /B
To run it against an input file named sample.txt, use a command line like this:
extract-failed-urls.bat "sample.txt"
To write the output to another file named failed-urls.txt, use this:
extract-failed-urls.bat "sample.txt" > "failed-urls.txt"
With the data from the sample input file in the question, the output is:
FileName: LastABC-1563220.xml
123456786 12348 1234DEF
http://Product.com/1234DEF

123456784 12340 1234FGH
http://Product.com/1234FGH_1
http://Product.com/1234FGH_2
http://Product.com/1234FGH_4
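This approach distinguishes several kinds of input lines; recognising each one triggers a corresponding action (as described by the rem remarks in the script):
the first line (the one beginning with FileName:): it is read and written to the output separately;
lines consisting only of hyphens (-------------------------------): the URL buffer is cleared;
lines containing ://: the line is stored in the URL buffer;
lines beginning with HTTP/1.1 + SPACE: the status number is read, then the URL buffer is cleared;
if that number is 404: the buffered IDs (if any) are written and cleared, then the stored URL (if any) is written;
any other line: it is treated as an ID and added to the IDs buffer.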
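The line classification described above can also be restated as a small state machine. The following is only a Python sketch of that logic, not a translation of the batch script: the function name extract_failed and its parameters are hypothetical, it recognises the header by its FileName: prefix instead of by position, and it does not emit the blank line that the batch script prints between groups.

```python
def extract_failed(lines, url_mark="://", resp_word="HTTP/1.1", err="404"):
    """Return the header line plus, for each failed request, its IDs and URL.

    Hypothetical helper for illustration; IDs are written once per group,
    just before the first URL of that group that returned `err`.
    """
    out = []
    ids = []          # IDs collected for the current group
    url = None        # most recent URL line, waiting for its response
    new_group = True  # True when the next ID line starts a fresh group
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # skip empty lines
        parts = line.split()
        if line.startswith("FileName:"):
            out.append(line)              # header line is copied as-is
        elif set(line) == {"-"}:
            url = None                    # separator: forget any pending URL
        elif url_mark in line:
            url = line                    # URL line: remember it
            new_group = True
        elif parts[0] == resp_word:
            status = parts[1] if len(parts) > 1 else ""
            if status == err:
                if ids:
                    out.append(" ".join(ids))
                    ids = []
                if url:
                    out.append(url)
            url = None
            new_group = True
        else:
            # Anything else is treated as an ID line.
            if new_group:
                ids = [line]
            else:
                ids.append(line)
            url = None
            new_group = False
    return out
```

Because the IDs stay buffered until the first 404 in their group, the 1234FGH IDs are printed before http://Product.com/1234FGH_1 even though http://Product.com/1234FGH itself returned 200, which is what the desired output in the question asks for.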