I have a batch file that takes its input from a txt file that looks like this.
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.
Server name lak-print01
Printer name Microsoft XPS Document Writer
Share name
Driver name Microsoft XPS Document Writer
Port name XPSPort:
Comment
Location
Print processor WinPrint
Data type RAW
Parameters
Attributes 64
Priority 1
Default priority 1
Average pages per minute 0
Printer status Idle
Extended printer status Unknown
Detected error state Unknown
Extended detected error state Unknown
Server name lak-print01
Printer name 4250_Q1
Share name 4250_Q1
Driver name Canon iR5055/iR5065 PCL5e
Port name IP_192.168.202.84
Comment Audit Department in Lakewood Operations
Location Operations Center
Print processor WinPrint
Data type RAW
Parameters
Attributes 10826
Priority 1
Default priority 0
Average pages per minute 0
Printer status Idle
Extended printer status Unknown
Detected error state Unknown
Extended detected error state Unknown
Server name lak-print01
Printer name 3130_Q1
Share name 3130_Q1
Driver name Canon iR1020/1024/1025 PCL5e
Port name IP_192.168.202.11
Comment Canon iR1025
Location Operations Center
Print processor WinPrint
Data type RAW
Parameters
Attributes 10824
Priority 1
Default priority 0
Average pages per minute 0
Printer status Idle
Extended printer status Unknown
Detected error state Unknown
Extended detected error state Unknown
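It parses the file to pull certain items out of the list, such as the server name, printer name, driver name, and so on. It then puts each block on its own comma-separated line, so I end up with multiple rows, one per block of text, with each column holding a specific piece of information. Some of these txt files have more than 100 entries. Each file I try to parse takes 5-10 minutes.
The parse code is below.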
:Parselak-print01
SETLOCAL enabledelayedexpansion
:: remove variables starting $
FOR /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
(FOR /f "delims=" %%a IN (lak-print01.txt) DO CALL :analyse "%%a")>lak-print01.csv
attrib +h lak-print01.csv
GOTO :EOF
:analyse
SET "line=%~1"
SET /a fieldnum=0
FOR %%s IN ("Server name" "Printer name" "Driver name"
"Port name" "Location" "Comment" "Printer status"
"Extended detected error state") DO CALL :setfield %%~s
GOTO :eof
:setfield
SET /a fieldnum+=1
SET "linem=!line:*%* =!"
SET "linet=%* %linem%"
IF "%linet%" neq "%line%" GOTO :EOF
IF "%linem%"=="%line%" GOTO :EOF
SET "$%fieldnum%=%linem%"
IF NOT DEFINED $8 GOTO :EOF
SET "line="
FOR /l %%q IN (1,1,7) DO SET "line=!line!,!$%%q!"
ECHO !line:~1!
:: remove variables starting $
FOR /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
GOTO :eof
The output I get is
lak-print01,Microsoft XPS Document Writer,Microsoft XPS Document Writer,XPSPort:,,,Idle
lak-print01,4250_Q1,Canon iR5055/iR5065 PCL5e,IP_192.168.202.84,Operations Center,Audit Department in Lakewood Operations,Idle
lak-print01,3130_Q1,Canon iR1020/1024/1025 PCL5e,IP_192.168.202.11,Operations Center,Canon iR1025 ,Idle
lak-print01,1106_TRN,HP LaserJet P2050 Series PCL6,IP_172.16.10.97,Monroe,HP P2055DN,Idle
lak-print01,1101_TRN,HP LaserJet P2050 Series PCL6,IP_10.3.3.22,Burlington,Training Room printer,Idle
lak-print01,1096_Q3,Canon iR1020/1024/1025 PCL5e,IP_192.168.96.248,Silverdale,Canon iR 1025,Idle
lak-print01,1096_Q2,Kyocera Mita KM-5035 KX,IP_192.168.96.13,Silverdale,Kyocera CS-5035 all in one,Idle
lak-print01,1096_Q1,HP LaserJet P4010_P4510 Series PCL 6,IP_192.168.96.12,Silverdale,HP 4015,Idle
lak-print01,1095_Q3,HP LaserJet P4010_P4510 Series PCL 6,IP_192.168.95.247,Sequim,HP LaserJet 4015x,Idle
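Everything is perfect and the code works as expected... but it is very slow!
How can I speed it up? The problem is that there is no real delimiter and the token positions vary; for example, Comment needs token 2, but Printer name needs token 3.
Any help speeding up the parsing would be appreciated. The program works fine, but the parsing step is very slow.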
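Answer 0: (score: 6)
If you need speed, I suggest Marpa, a general BNF parser, in Perl: code, output.
It takes a while to get used to, but it gets the job done and gives you a very powerful tool that is easy to use. Note how closely the grammar resembles the input.
Hope this helps.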
Answer 1: (score: 3)
Using CALL is very slow. See whether this gives you the output you need; it would be interesting to hear how fast it is.
@echo off
:Parselak-print01
SETLOCAL enabledelayedexpansion
(FOR /f "delims=" %%a IN (lak-print01.txt) DO (
    for /f "tokens=1,2,*" %%b in ("%%a") do (
        if "%%b"=="Server" set "server=%%d"
        if "%%b"=="Printer" if "%%c"=="name" (set "printer=%%d") else (set "printerstatus=%%d")
        if "%%b"=="Driver" set "driver=%%d"
        if "%%b"=="Port" set "port=%%d"
        if "%%b"=="Location" for /f "tokens=1,*" %%e in ("%%a") do set "location=%%f"
        if "%%b"=="Comment" for /f "tokens=1,*" %%e in ("%%a") do set "comment=%%f"
        if "%%b"=="Extended" for /f "tokens=1-4,*" %%e in ("%%a") do if "%%f"=="detected" set "extendeddetected=%%i"
    )
    if defined extendeddetected (
        echo !server!,!printer!,!driver!,!port!,!location!,!comment!,!printerstatus!,!extendeddetected!
        set "server="
        set "printer="
        set "driver="
        set "port="
        set "location="
        set "comment="
        set "printerstatus="
        set "extendeddetected="
    )
))>lak-print01.csv
attrib +h lak-print01.csv
pause
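To compare the run time of the original script against a rewrite like this one, a small wrapper can print the clock before and after the run. This is only a rough sketch: parse.cmd is a hypothetical file name standing in for whichever script is being measured, and any trailing pause should be removed from that script first so the wait for a keypress is not counted.

```batch
@echo off
setlocal
rem Hypothetical wrapper: parse.cmd is a placeholder for the script being timed.
rem %time% is expanded when each line is read, so the second value is taken
rem only after the called script has returned.
echo Started:  %time%
call parse.cmd
echo Finished: %time%
```

Running the wrapper once per variant against the same input file gives a simple before/after comparison.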
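Answer 2: (score: 3)
The solution below assumes the input file has a fixed format: two header lines followed by blocks of 18 lines, always in the same order. If that holds, this solution generates the output very quickly; otherwise it must be modified accordingly.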
@echo off
setlocal EnableDelayedExpansion
rem Create the array of variable names for the *desired rows* of data in the file
set "row[1]=Server name"
set "row[2]=Printer name"
set "row[4]=Driver name"
set "row[5]=Port name"
set "row[6]=Comment"
set "row[7]=Location"
set "row[15]=Printer status"
set i=0
(for /F "skip=2 delims=" %%a in (lak-print01.txt) do (
    set /A i+=1
    if defined row[!i!] (
        set "line=%%a"
        for %%i in (!i!) do for /F "delims=" %%v in ("!row[%%i]!") do set "%%v=!line:*%%v =!"
    )
    if !i! equ 18 (
        echo !Server name!,!Printer name!,!Driver name!,!Port name!,!Location!,!Comment!,!Printer status!
        set i=0
    )
)) > lak-print01.csv
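Since this approach depends on the fixed layout, it may be worth checking a file before parsing it. The sketch below is an assumption-laden pre-check, not part of the answer's code: it counts the non-empty lines (FOR /F ignores empty ones) and tests whether they fit the pattern of 2 header lines plus a whole number of 18-line blocks. The variable names lines, body, leftover and blocks are made up for illustration.

```batch
@echo off
setlocal
rem Hypothetical pre-check for the fixed layout assumed above.
rem findstr /R "." keeps lines with at least one character;
rem find /C /V "" then counts those lines.
set "lines=0"
for /F %%n in ('findstr /R "." "lak-print01.txt" ^| find /C /V ""') do set "lines=%%n"
rem Subtract the two header lines, then split the rest into 18-line blocks.
set /A "body=lines-2, leftover=body %% 18, blocks=body / 18"
if %lines% lss 2 (
    echo Input file is missing or has no header lines.
) else if %leftover% neq 0 (
    echo Unexpected layout: %leftover% lines left over after %blocks% full blocks.
) else (
    echo Layout looks regular: %blocks% blocks of 18 lines.
)
```

A matching line count does not prove the fields appear in the expected order, so this only catches files with missing or extra lines.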