I am new to Clojure, so please bear with me. I have an XML file that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<XVar Id="cdx9" Type="Dictionary">
<XVar Id="Base.AccruedPremium" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="0"/>
</Row>
</XVar>
<XVar Id="TrancheAnalysis.IndexDuration" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="3.4380728252313069"/>
</Row>
</XVar>
<XVar Id="TrancheAnalysis.IndexLevel01" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="30693.926279941188"/>
</Row>
</XVar>
<XVar Id="TrancheAnalysis.TrancheDelta" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="8.9304387917502073"/>
</Row>
</XVar>
<XVar Id="TrancheAnalysis.TrancheDuration" Type="Multi" Value="" Rows="1" Columns="1">
<Row Id="0">
<Col Id="0" Type="Num" Value="3.0775955481964035"/>
</Row>
</XVar>
</XVar>
This structure repeats many times. From it I would like to generate a CSV file with these columns:
IndexName,TrancheAnalysis.IndexDuration,TrancheAnalysis.TrancheDuration
cdx9,3.4380728252313069,3.0775955481964035
.........................................
.........................................
I am able to parse a simple XML file like this one:
<?xml version="1.0" encoding="UTF-8"?>
<CalibrationData>
<IndexList>
<Index>
<Calibrate>Y</Calibrate>
<UseClientIndexQuotes>Y</UseClientIndexQuotes>
<IndexName>HYCDX10</IndexName>
<Tenor>06/20/2013</Tenor>
<TenorName>3Y</TenorName>
<IndexLevels>219.6</IndexLevels>
<Tranche>Equity0To0.15</Tranche>
<TrancheStart>0</TrancheStart>
<TrancheEnd>0.15</TrancheEnd>
<UseBreakEvenSpread>1</UseBreakEvenSpread>
<UseTlet>0</UseTlet>
<IsTlet>0</IsTlet>
<PctExpectedLoss>0</PctExpectedLoss>
<UpfrontFee>52.125</UpfrontFee>
<RunningFee>0</RunningFee>
<DeltaFee>5.3</DeltaFee>
<CentralCorrelation>0.1</CentralCorrelation>
<Currency>USD</Currency>
<RescalingMethod>PTIndexRescaling</RescalingMethod>
<EffectiveDate>06/17/2011</EffectiveDate>
</Index>
</IndexList>
</CalibrationData>
using this code:
(ns DynamicProgramming
  (:require [clojure.xml :as xml]))

;; Input files
(def calibrationFile "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/CalibrationQuotes.xml")
(def mktdataFile "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/MarketData.xml")
(def sample "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/Sample.xml")

;; Parse the calibration input file and keep the text of the selected tags
(def CalibOp
  (for [x (xml-seq (xml/parse (java.io.File. calibrationFile)))
        :when (#{:IndexName :Tenor :UpfrontFee :RunningFee :DeltaFee
                 :IndexLevels :TrancheStart :TrancheEnd}
               (:tag x))]
    (first (:content x))))

(println CalibOp)
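But the second XML is simple. For the first sample, on the other hand, I don't know how to walk the nested structure and extract the information I want.
Any help would be great.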
Answer 0 (score: 8)
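I would use data.zip (formerly clojure.contrib.zip-filter). It provides a lot of XML parsing functionality and makes it easy to run XPath-like expressions. The README describes it as a system for filtering trees, and XML trees in particular.
Below is some sample code for creating a "row" for the CSV file. The row is a map of column name to attribute value.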
(ns work
  (:require [clojure.xml :as xml]
            [clojure.zip :as zip]
            ;; data.zip namespace; the older contrib name was clojure.contrib.zip-filter.xml
            [clojure.data.zip.xml :as zf]))

; create a zip from the xml file
(def zip (zip/xml-zip (xml/parse "data.xml")))

; pulls out a list of all of the root "Id" attribute values
(zf/xml-> zip (zf/attr :Id))

(defn value
  "Finds the id and value for a particular element"
  [xvar-zip] ; the docstring must come before the argument vector
  (let [id (-> xvar-zip zip/node :attrs :Id) ; manual access
        value (zf/xml1-> xvar-zip ; use xpath like expression to pull value out
                         :Row ; need the row element
                         :Col ; then the column element
                         (zf/attr :Value))] ; and finally pull the Value out
    {id value}))

; gets the "column-value" pair for a single column
(zf/xml1-> zip
           (zf/attr= :Id "cdx9") ; filter on id "cdx9"
           :XVar ; filter on XVars under it
           (zf/attr= :Id "TrancheAnalysis.IndexDuration") ; filter on id
           value) ; apply the value function on the result of above

; creates a map of every column key to its corresponding value
(apply merge (zf/xml-> zip (zf/attr= :Id "cdx9") :XVar value))
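The row map built above still has to be written out as a CSV line. The following is a minimal sketch of that step, not part of the original answer: the names `columns`, `csv-header`, `csv-line` and `sample-row` are made up for illustration, the sample row is typed in by hand so that the snippet runs without data.zip, and no CSV quoting or escaping is attempted.

```clojure
(require '[clojure.string :as str])

;; Hypothetical column order, copied from the header line in the question.
(def columns ["TrancheAnalysis.IndexDuration"
              "TrancheAnalysis.TrancheDuration"])

(defn csv-header
  "Header line: IndexName followed by the selected column ids."
  [columns]
  (str/join "," (cons "IndexName" columns)))

(defn csv-line
  "One CSV line for an index name and its row map (column id -> value string).
  Columns missing from the row become empty fields."
  [index-name row columns]
  (str/join "," (cons index-name (map #(get row % "") columns))))

;; Hand-written sample, shaped like the map returned by (apply merge ...).
(def sample-row
  {"Base.AccruedPremium"             "0"
   "TrancheAnalysis.IndexDuration"   "3.4380728252313069"
   "TrancheAnalysis.TrancheDuration" "3.0775955481964035"})

(println (csv-header columns))
(println (csv-line "cdx9" sample-row columns))
```

Keeping the column list as data means the same two functions work for any subset of the XVar ids. If values may contain commas or quotes, a dedicated CSV library would be a safer choice than `str/join`.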
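I'm not sure how the XML will work with multiple Dictionary XVars, since that element is the root. If you need to handle that, another function that is useful for this kind of work is mapcat, which concatenates all of the values returned by the mapping function.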
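As an illustration of mapcat in this setting, here is a small sketch that uses only clojure.xml and plain map access rather than data.zip. It assumes a hypothetical `<Results>` wrapper element holding several Dictionary XVars; that wrapper, the shortened values and the helper names (`parse-str`, `children`, `cell-value`, `dictionary-cells`) are invented for the example and do not come from the question.

```clojure
(require '[clojure.xml :as xml])

;; Hypothetical input: a made-up <Results> root wrapping two dictionaries.
(def sample-xml
  (str "<Results>"
       "<XVar Id=\"cdx9\" Type=\"Dictionary\">"
       "<XVar Id=\"TrancheAnalysis.IndexDuration\" Type=\"Multi\">"
       "<Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"3.4\"/></Row></XVar>"
       "<XVar Id=\"TrancheAnalysis.TrancheDuration\" Type=\"Multi\">"
       "<Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"3.0\"/></Row></XVar>"
       "</XVar>"
       "<XVar Id=\"cdx10\" Type=\"Dictionary\">"
       "<XVar Id=\"TrancheAnalysis.IndexDuration\" Type=\"Multi\">"
       "<Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"3.9\"/></Row></XVar>"
       "</XVar>"
       "</Results>"))

(defn parse-str
  "Parses an XML string into the nested maps produced by clojure.xml."
  [s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn children
  "Direct child elements of node with the given tag."
  [node tag]
  (filter #(= tag (:tag %)) (:content node)))

(defn cell-value
  "Value attribute of the first Col in the first Row of an XVar."
  [xvar]
  (-> xvar (children :Row) first (children :Col) first :attrs :Value))

(defn dictionary-cells
  "Returns a seq of [index-id column-id value] triples for one dictionary."
  [dict]
  (for [xvar (children dict :XVar)]
    [(-> dict :attrs :Id) (-> xvar :attrs :Id) (cell-value xvar)]))

;; mapcat calls dictionary-cells on each dictionary and concatenates the
;; resulting seqs into one flat seq of triples.
(def cells
  (mapcat dictionary-cells (children (parse-str sample-xml) :XVar)))

(doseq [cell cells] (println cell))
```

With plain map, the result would be one nested seq per dictionary; mapcat flattens those one level so that later steps can treat every cell uniformly.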
There are more examples in the test source.
My other big recommendation is to use lots of small functions. You will find them easier to debug, test, and work with.