While looking for a way to group elements in an XML file, I came across the Muenchian Method.
Source
<file>
  <patient>
    <Lab_Specimen_Number>L,18.1342718.Y</Lab_Specimen_Number>
    <Patient_Number>LOC0000015</Patient_Number>
    <ORG/>
    <Specimen/>
    <Antibiotic_Amox_Ampicillin/>
  </patient>
  <patient>
    <Lab_Specimen_Number>L,18.1342727.V</Lab_Specimen_Number>
    <Patient_Number>LOC0000001</Patient_Number>
    <ORG>Coliform</ORG>
    <Specimen>L,18.1342727.VA</Specimen>
    <Antibiotic_Amox_Ampicillin>S</Antibiotic_Amox_Ampicillin>
  </patient>
  <patient>
    <Lab_Specimen_Number/>
    <Patient_Number/>
    <ORG>Staphylococcus aureus</ORG>
    <Specimen>L,18.1342727.VA</Specimen>
    <Antibiotic_Amox_Ampicillin>S</Antibiotic_Amox_Ampicillin>
  </patient>
  <patient>
    <Lab_Specimen_Number>L,18.1346290.T</Lab_Specimen_Number>
    <Patient_Number>LOC0000001</Patient_Number>
    <ORG>Coliform</ORG>
    <Specimen>L,18.1346290.TA</Specimen>
    <Antibiotic_Amox_Ampicillin>S</Antibiotic_Amox_Ampicillin>
  </patient>
  <patient>
    <Lab_Specimen_Number>L,18.1342713.X</Lab_Specimen_Number>
    <Patient_Number>LOC0000009</Patient_Number>
    <ORG/>
    <Specimen/>
    <Antibiotic_Amox_Ampicillin/>
  </patient>
</file>
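Following the article, I changed the match when declaring the key to patient[Specimen != ''] instead of patient, because the Specimen value can be empty and those patients would be lost from the final output if only patient were used.
Transformation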
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:key name="patients-by-specimen" match="patient[Specimen != '']" use="Specimen" />
  <xsl:template match="file">
    <file>
      <xsl:for-each select="patient[count(. | key('patients-by-specimen', Specimen)[1]) = 1]">
        <patient>
          <xsl:copy-of select="Lab_Specimen_Number" />
          <xsl:copy-of select="Patient_Number" />
          <Specimen>
            <xsl:copy-of select="Specimen" />
            <Organisms>
              <xsl:for-each select="key('patients-by-specimen', Specimen)">
                <Organism>
                  <xsl:copy-of select="ORG"/>
                  <xsl:copy-of select="Antibiotic_Amox_Ampicillin"/>
                </Organism>
              </xsl:for-each>
            </Organisms>
          </Specimen>
        </patient>
      </xsl:for-each>
    </file>
  </xsl:template>
</xsl:stylesheet>
Although the transformation above gives me the desired output, I do not fully understand how this line works:
<xsl:for-each select="patient[count(. | key('patients-by-specimen', Specimen)[1]) = 1]">
Can someone explain this iteration in the context of my source file?
Output
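To see what the predicate evaluates to for each patient, a small diagnostic stylesheet can help. The following is only a sketch for inspection, not part of the transformation above: it reuses the same key declaration, and the text labels ("key size", "union size") are made up for illustration. For every patient it prints how many nodes the key lookup returns and the size of the union that the predicate compares against 1.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Plain-text output: one diagnostic line per patient -->
  <xsl:output method="text"/>

  <!-- Same key as in the transformation: only patients with a non-empty Specimen are indexed -->
  <xsl:key name="patients-by-specimen" match="patient[Specimen != '']" use="Specimen"/>

  <xsl:template match="/file">
    <xsl:for-each select="patient">
      <!-- Position of this patient among all patient elements -->
      <xsl:value-of select="position()"/>
      <xsl:text>: key size=</xsl:text>
      <!-- Number of indexed patients that share this patient's Specimen value -->
      <xsl:value-of select="count(key('patients-by-specimen', Specimen))"/>
      <xsl:text>, union size=</xsl:text>
      <!-- Union of the current patient and the first patient in its group.
           A union never holds the same node twice, so the size is 1 when the
           current patient IS that first patient, or when the group is empty. -->
      <xsl:value-of select="count(. | key('patients-by-specimen', Specimen)[1])"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the source file above, this should print:

1: key size=0, union size=1
2: key size=2, union size=1
3: key size=2, union size=2
4: key size=1, union size=1
5: key size=0, union size=1

Only the third patient has a union size of 2, because the first node in its group is the second patient rather than itself, so the for-each skips it and it appears only as an Organism inside the second patient's group. The first and fifth patients have an empty Specimen and are not indexed by the key, so their lookup returns nothing, the union contains just the patient itself, and each is kept as its own entry.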
<file>
  <patient>
    <Lab_Specimen_Number>L,18.1342718.Y</Lab_Specimen_Number>
    <Patient_Number>LOC0000015</Patient_Number>
    <Specimen>
      <Specimen/>
      <Organisms/>
    </Specimen>
  </patient>
  <patient>
    <Lab_Specimen_Number>L,18.1342727.V</Lab_Specimen_Number>
    <Patient_Number>LOC0000001</Patient_Number>
    <Specimen>
      <Specimen>L,18.1342727.VA</Specimen>
      <Organisms>
        <Organism>
          <ORG>Coliform</ORG>
          <Antibiotic_Amox_Ampicillin>S</Antibiotic_Amox_Ampicillin>
        </Organism>
        <Organism>
          <ORG>Staphylococcus aureus</ORG>
          <Antibiotic_Amox_Ampicillin>S</Antibiotic_Amox_Ampicillin>
        </Organism>
      </Organisms>
    </Specimen>
  </patient>
  <patient>
    <Lab_Specimen_Number>L,18.1346290.T</Lab_Specimen_Number>
    <Patient_Number>LOC0000001</Patient_Number>
    <Specimen>
      <Specimen>L,18.1346290.TA</Specimen>
      <Organisms>
        <Organism>
          <ORG>Coliform</ORG>
          <Antibiotic_Amox_Ampicillin>S</Antibiotic_Amox_Ampicillin>
        </Organism>
      </Organisms>
    </Specimen>
  </patient>
  <patient>
    <Lab_Specimen_Number>L,18.1342713.X</Lab_Specimen_Number>
    <Patient_Number>LOC0000009</Patient_Number>
    <Specimen>
      <Specimen/>
      <Organisms/>
    </Specimen>
  </patient>
</file>