XML parsing using the Windows command line

Time: 2015-04-10 04:44:44

Tags: xml parsing cmd

How can I extract the complete lines that match certain criteria from an XML file (1 GB in size) into a separate new file?

Here is an example:

Sample XML file

<?xml version="1.0" encoding="UTF-16"?>
<root>
<metadata xmlns="log"> <filename>QServer_64Trans_6576_20150325_1049_0.xml</filename> <date>25-03-2015</date> <executablename>E:/Program Files/Quintiq/Quintiq 4.4.0/Bin/QServer_64.exe</executablename> <originalfilename>QServer_64.exe</originalfilename> <productname>Quintiq Server 64-bit</productname> <productversion>4.4.0.13</productversion> <specialbuilddescription>Changelist: 60074</specialbuilddescription> <loggerversion>1.1</loggerversion> </metadata>
<logentries xmlns="log">
<logentry> <index>233509</index> <jobid>14442692</jobid> <status>Finished</status> <result></result> <transactionid>816712145</transactionid> <transactionkind>JobDefinition</transactionkind> <threadname>Trans_0</threadname> <threadid>5912</threadid> <starttime>12:29:50</starttime> <endtime>12:29:50</endtime> <length>00:00:00.031</length> <waitingtime>00:00:00.000</waitingtime> <proctime>00:00:00.015</proctime> <functime>00:00:00.000</functime> <dbtime>00:00:00.016</dbtime> <streamtime>00:00:00.000</streamtime> <nrdatasets>0</nrdatasets> <size>858</size> <constructions>0</constructions> <destructions>0</destructions> <changes>1</changes> <clientid>0</clientid> <ipclient></ipclient> <username>$system/SYSTEM</username> <actionelementtype>Company</actionelementtype> <actionelementname>NotifySenderRecipientAvailable</actionelementname> <actionelementkey>[706273.0.113959]</actionelementkey> <description></description> <messageid></messageid> <lockprofile>[N.[706272.0.12777530]]</lockprofile> <procmem>6744</procmem> <funcmem>126720</funcmem> <dbmem>2696</dbmem> <streammem>6288</streammem> <osvmsize></osvmsize> <freememory></freememory>  </logentry>
<logentry> <index>233510</index> <jobid>14442726</jobid> <status>Started</status> <result></result> <transactionid>816711772</transactionid> <transactionkind>Daemon</transactionkind> <threadname>Trans_1</threadname> <threadid>6208</threadid> <starttime>12:29:51</starttime> <endtime></endtime> <length></length> <waitingtime></waitingtime> <proctime></proctime> <functime></functime> <dbtime></dbtime> <streamtime></streamtime> <nrdatasets></nrdatasets> <size></size> <constructions></constructions> <destructions></destructions> <changes></changes> <clientid>0</clientid> <ipclient></ipclient> <username></username> <actionelementtype>Company</actionelementtype> <actionelementname>DaemonTimedEventHandler</actionelementname> <actionelementkey>[706273.0.88663]</actionelementkey> <description></description> <messageid></messageid> <lockprofile>[N.[706272.0.12777530]]</lockprofile> <procmem></procmem> <funcmem></funcmem> <dbmem></dbmem> <streammem></streammem> <osvmsize></osvmsize> <freememory></freememory>  </logentry>
<logentry> <index>233511</index> <jobid>14442757</jobid> <status>Started</status> <result></result> <transactionid>816712147</transactionid> <transactionkind>ExternalCallMessage</transactionkind> <threadname>Trans_0</threadname> <threadid>5912</threadid> <starttime>12:29:51</starttime> <endtime></endtime> <length></length> <waitingtime></waitingtime> <proctime></proctime> <functime></functime> <dbtime></dbtime> <streamtime></streamtime> <nrdatasets></nrdatasets> <size></size> <constructions></constructions> <destructions></destructions> <changes></changes> <clientid>1290</clientid> <ipclient>10.48.84.220</ipclient> <username>active directory/jonathangonsalvez</username> <actionelementtype>TicketOfWorkCustom</actionelementtype> <actionelementname>Update</actionelementname> <actionelementkey>[103648.0.1464376049]</actionelementkey> <description>BoundCall on INST</description> <messageid></messageid> <lockprofile>[N.[706272.23.358965396]]</lockprofile> <procmem></procmem> <funcmem></funcmem> <dbmem></dbmem> <streammem></streammem> <osvmsize></osvmsize> <freememory></freememory>  </logentry>
<logentry> <index>233512</index> <jobid>14442726</jobid> <status>Finished</status><result></result> <transactionid>816711772</transactionid> <transactionkind>Daemon</transactionkind><threadname>Trans_1</threadname> <threadid>6208</threadid> <starttime>12:29:51</starttime> <endtime>12:29:51</endtime> <length>00:00:00.078</length> <waitingtime>00:00:00.000</waitingtime> <proctime>00:00:00.047</proctime> <functime>00:00:00.000</functime> <dbtime>00:00:00.031</dbtime> <streamtime>00:00:00.002</streamtime> <nrdatasets>0</nrdatasets> <size>6595</size> <constructions>0</constructions> <destructions>1</destructions> <changes>3</changes> <clientid>0</clientid> <ipclient></ipclient> <username></username> <actionelementtype>Company</actionelementtype> <actionelementname>DaemonTimedEventHandler</actionelementname> <actionelementkey>[706273.0.88663]</actionelementkey> <description></description> <messageid></messageid> <lockprofile>[N.[706272.0.12777530]]</lockprofile> <procmem>1060016</procmem> <funcmem>126696</funcmem> <dbmem>2832</dbmem> <streammem>32688</streammem> <osvmsize></osvmsize> <freememory></freememory>  </logentry>
</logentries>
</root>

Search condition: "logentry" elements that have "Finished" as status and "Daemon" as transaction kind.

<logentry>     <status>Finished</status>     <transactionkind>Daemon</transactionkind>

All lines in the XML file that match this condition should be moved to a new text file using the Windows command line.

1 answer:

Answer 0 (score: 0)

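I was going to start by saying "use C# or XSLT" or "omgwtf? powershell!", but then I read the 1 GB part. Unless you can load the whole file into RAM, you will have to use an XmlReader, which reads the document as a stream instead of building it in memory.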
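The streaming idea behind XmlReader can be sketched in other tools as well. The following is a minimal sketch, assuming Python 3.8+ is installed on the machine (the answer itself names only C#, XSLT and PowerShell). It uses the standard library's `xml.etree.ElementTree.iterparse` to visit one `logentry` at a time. The function name `filter_entries` and the file names are hypothetical; the `log` namespace and the element names are taken from the sample file.

```python
import xml.etree.ElementTree as ET

NS = "{log}"  # the sample declares xmlns="log" on <logentries>


def filter_entries(src, dst, status="Finished", kind="Daemon"):
    """Stream src and write each matching <logentry> to dst, one per line.

    Hypothetical helper: returns the number of entries written.
    """
    written = 0
    parent = None  # the enclosing <logentries> element, once seen
    with open(dst, "w", encoding="utf-8") as out:
        for event, elem in ET.iterparse(src, events=("start", "end")):
            if event == "start":
                if elem.tag == NS + "logentries":
                    parent = elem
                continue
            if elem.tag != NS + "logentry":
                continue
            if (elem.findtext(NS + "status") == status
                    and elem.findtext(NS + "transactionkind") == kind):
                elem.tail = None  # do not carry trailing whitespace along
                out.write(ET.tostring(elem, encoding="unicode",
                                      default_namespace="log",
                                      short_empty_elements=False) + "\n")
                written += 1
            # Drop entries that were already handled so memory stays bounded.
            if parent is not None:
                parent.clear()
    return written
```

Calling `filter_entries("input.xml", "out.txt")` would write the matching entries to `out.txt`. The entries are re-serialized, so they may differ from the original lines in details such as the added `xmlns="log"` attribute.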

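If you can live with something hacky, try findstr /v Started input.xml > out.xml. This keeps every line of input.xml that does not contain the word "Started". There is no guarantee, however, that the result is still well-formed XML, and it does not filter on the transaction kind.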
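The same line-based idea can be made to match the stated condition directly. The sketch below assumes Python 3 is available and that every `logentry` sits on a single line, as in the sample. It keeps only lines containing both markers and decodes the file as UTF-16, which the sample declares and which findstr may not match as expected. The function name `filter_lines` and the file names are hypothetical.

```python
def filter_lines(src, dst,
                 needles=("<status>Finished</status>",
                          "<transactionkind>Daemon</transactionkind>"),
                 encoding="utf-16"):
    """Copy lines of src that contain every needle to dst.

    Hypothetical helper: plain text matching, not XML parsing, so it relies
    on each <logentry> occupying exactly one line. Returns the lines kept.
    """
    kept = 0
    with open(src, encoding=encoding) as inp, \
            open(dst, "w", encoding="utf-8") as out:
        for line in inp:  # reads one line at a time, not the whole file
            if all(needle in line for needle in needles):
                out.write(line)
                kept += 1
    return kept
```

Calling `filter_lines("input.xml", "out.txt")` would copy the matching lines unchanged. Like the findstr variant, this treats the file as text, so it breaks if an entry ever spans several lines.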

For reference, here is the PowerShell version:

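# Loads the whole document into memory, so this only suits files that fit in RAM
$xml = [xml](Get-Content .\input.xml)
$xml.Root.LogEntries.LogEntry | where { $_.status -eq "Finished" -and $_.transactionkind -eq "Daemon" } | ft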