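I am trying to convert multiple CSV files (two for now) to XML using ElementTree, but I am not getting the exact output I want. Please guide me toward a more efficient approach. PS: I am a beginner here.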
import csv
import xml.etree.ElementTree as ET
#from bs4 import BeautifulSoup

root = ET.Element('Policy')

with open("policy.csv","r") as p, open("Att.csv","r") as a, open("rider.csv","r") as r:
    csv_p = csv.reader(p)
    header_p = next(csv_p)
    csv_a = csv.reader(a)
    header_a = next(csv_a)
    csv_r = csv.reader(r)
    header_r = next(csv_r)
    for row in csv_p:
        pid = row[0]
        print("\n",pid)
        for col in range(len(header_p)):
            ET.SubElement(root, header_p[col]).text = str(row[col])
            for childrow in csv_a:
                if(pid == childrow[0]):
                    print("Match found")
                    child = ET.SubElement(root,"child")
                    for col_a in range(len(header_a)):
                        ET.SubElement(child, header_a[col_a]).text = str(childrow[col_a])
                        for tailrow in csv_r:
                            if(childrow[1] == tailrow[0]):
                                print("tail found",tailrow[0])
                                tail = ET.SubElement(child,"tail")
                                for col_r in range(len(header_r)):
                                    ET.SubElement(tail, header_r[col_r]).text = str(tailrow[col_r])
                    r.seek(0)
        a.seek(0)

tree = ET.tostring(root, encoding="UTF-8")
#print(BeautifulSoup(tree, "xml").prettify())
with open("Output.xml", "wb") as f:
    f.write(tree)
with open('Output.xml', 'r') as f:
    print("\n\n",f.read())
The output looks like the following, but as you can see, some tags are repeated because they are redundant across the files I am reading:
Policy.csv:
Pid,Name,Date
101,Life In,3Jan2017
102,Mobile,8Aug2018
Att.csv:
PId,AId,Name
101,9001,Pune
101,9002,Mumbai
102,9003,Delhi
rider.csv:
AId,RID,Name
9001,10001,Ramesh
9001,10002,Suresh
9002,10003,Rahul
9002,10004,Kirti
Output:
<Policy>
<Pid>101</Pid>
<child>
<PId>101</PId>
<tail><AId>9001</AId>
<RID>10001</RID>
<Name>Ramesh</Name>
</tail>
<tail>
<AId>9001</AId>
<RID>10002</RID>
<Name>Suresh</Name>
</tail>
<AId>9001</AId>
<Name>Pune</Name>
</child>
<child>
<PId>101</PId>
<tail><AId>9002</AId>
<RID>10003</RID>
<Name>Rahul</Name>
</tail>
<tail><AId>9002</AId>
<RID>10004</RID>
<Name>Kirti</Name>
</tail>
<AId>9002</AId>
<Name>Mumbai</Name>
</child>
<Name>Life In</Name>
<Date>3Jan2017</Date>
</Policy>
Example of the desired output:
<Policy>
<Pid>101</Pid>
<child>
<AId>9001</AId>
<tail>
<RID>10001</RID>
<Name>Ramesh</Name>
</tail>
<tail>
<RID>10002</RID>
<Name>Suresh</Name>
</tail>
<Name>Pune</Name>
</child>
<Name>Life In</Name>
<Date>3Jan2017</Date>
</Policy>
Answer 0 (score: 0)
If you are able to use lxml, here is an example of what I was talking about in the comments.
Hopefully I have the logic right:
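policy is based on a row in Policy.csv and is uniquely identified by Pid.
child in policy is based on the rows in Att.csv with a matching PId.
tail in child is based on the rows in rider.csv with a matching AId.
The first thing I would do is convert the CSVs to a temporary XML format.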
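Since the header values of the CSV files are valid element names, I am just going to create elements based on those values.
If the header values in your CSV files might not be valid element names, you could use a generic element name and store the header value in an attribute. (I can change the example if needed.)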
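As a rough illustration of that attribute-based variant, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree so that it runs without extra packages; lxml's etree offers the same Element and SubElement calls. The function name csv_to_generic_xml, the element name field, and the attribute name name are assumptions made for this sketch, not something the answer defines.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_generic_xml(name, csvfile):
    """Build <name><row><field name="Header">value</field>...</row></name>."""
    result = ET.Element(name)
    for row in csv.DictReader(csvfile):
        row_elem = ET.SubElement(result, "row")
        for header, value in row.items():
            # The header is stored as an attribute value, so it may contain
            # spaces or other characters that are not legal in element names.
            field = ET.SubElement(row_elem, "field", {"name": header.strip()})
            field.text = (value or "").strip()
    return result


# Hypothetical sample whose headers would not be valid element names.
sample = io.StringIO("Policy Id,Start Date\n101,3Jan2017\n")
xml = csv_to_generic_xml("policy", sample)
print(ET.tostring(xml, encoding="unicode"))
```

With this layout, a stylesheet would have to select values by attribute (for example field[@name='Policy Id']) instead of by element name.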
Then I would transform the temporary XML and handle all of the grouping there. Since lxml only supports XSLT 1.0, we have to use Muenchian Grouping.
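For readers who have not met Muenchian Grouping, the idea can be sketched in plain Python: build an index from a key value to the rows that carry it, keep only the row that comes first under its own key, and then look up the rest of the group through the index. The list of dictionaries below is a hypothetical stand-in for the row elements of the temporary XML, using the Att.csv values from the question.

```python
from collections import defaultdict

# Stand-in for the att rows of the temporary XML (values from Att.csv).
rows = [
    {"pid": "101", "aid": "9001"},
    {"pid": "101", "aid": "9002"},
    {"pid": "102", "aid": "9003"},
]

# Plays the role of an xsl:key declaration: index the rows by pid.
index = defaultdict(list)
for row in rows:
    index[row["pid"]].append(row)

# Plays the role of the count(. | key(...)[1]) = 1 test:
# keep a row only if it is the first row stored under its own key.
first_of_group = [row for row in rows if row is index[row["pid"]][0]]

for row in first_of_group:
    group = index[row["pid"]]  # like calling key() with this row's pid
    print(row["pid"], [r["aid"] for r in group])
```

Each distinct pid is visited once, and its full group is fetched from the index instead of rescanning every row.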
Example...
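Python
import csv
from os import path

from lxml import etree


def csv2xml(file):
    # Name the wrapper element after the lower-cased file stem,
    # e.g. "policy.csv" -> <policy>, which is what transform.xsl matches.
    stem = path.splitext(path.basename(file))[0].lower()
    result = etree.Element(stem)
    with open(file, newline="") as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            row_elem = etree.SubElement(result, "row")
            for entry in row:
                # Lower-cased header names become the child element names.
                entry_elem = etree.SubElement(row_elem, entry.strip().lower())
                entry_elem.text = (row.get(entry) or "").strip()
    return result


# Adjust these names to match the files on disk.
csv_files = ["policy.csv", "att.csv", "rider.csv"]

# Collect all converted files under one temporary root element.
temp_xml = etree.Element("policies")
for csv_file in csv_files:
    temp_xml.append(csv2xml(csv_file))

# Apply the XSLT 1.0 stylesheet and print the grouped result.
xslt = etree.parse("transform.xsl")
xml_output = etree.ElementTree(temp_xml).xslt(xslt)
print(etree.tostring(xml_output, pretty_print=True).decode())
XSLT (transform.xsl)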
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:key name="policy" match="policy/row" use="pid"/>
  <xsl:key name="att" match="att/row" use="pid"/>
  <xsl:key name="rider" match="rider/row" use="aid"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/*">
    <xsl:copy>
      <xsl:apply-templates select="policy"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="policy">
    <xsl:for-each select="row[count(.|key('policy', pid)[1])=1]">
      <policy>
        <xsl:apply-templates select="pid"/>
        <xsl:apply-templates select="key('att', pid)"/>
        <xsl:apply-templates select="name|date"/>
      </policy>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="att/row">
    <child>
      <xsl:apply-templates select="aid"/>
      <xsl:apply-templates select="key('rider', aid)"/>
      <xsl:apply-templates select="name"/>
    </child>
  </xsl:template>

  <xsl:template match="rider/row">
    <tail>
      <xsl:apply-templates select="rid|name"/>
    </tail>
  </xsl:template>

</xsl:stylesheet>
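If lxml is not available, the same nesting can also be sketched with only the standard library by indexing the child rows in dictionaries first, so each file is read once instead of being rescanned in nested loops. This is an alternative sketch and not part of the XSLT approach; the helper names index_rows, add, and build, the wrapper element policies, and the inline CSV strings (copied from the question's sample data) are assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Sample data copied from the question; real code would open the files.
POLICY = "Pid,Name,Date\n101,Life In,3Jan2017\n102,Mobile,8Aug2018\n"
ATT = "PId,AId,Name\n101,9001,Pune\n101,9002,Mumbai\n102,9003,Delhi\n"
RIDER = ("AId,RID,Name\n9001,10001,Ramesh\n9001,10002,Suresh\n"
         "9002,10003,Rahul\n9002,10004,Kirti\n")


def index_rows(text, key):
    """Group the rows of one CSV by the value in column `key`."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        groups[row[key]].append(row)
    return groups


def add(parent, tag, text):
    """Append a <tag> child holding `text`."""
    ET.SubElement(parent, tag).text = text


def build(policy_text, att_text, rider_text):
    atts = index_rows(att_text, "PId")
    riders = index_rows(rider_text, "AId")
    root = ET.Element("policies")
    for p in csv.DictReader(io.StringIO(policy_text)):
        policy = ET.SubElement(root, "Policy")
        add(policy, "Pid", p["Pid"])
        for a in atts.get(p["Pid"], []):
            child = ET.SubElement(policy, "child")
            add(child, "AId", a["AId"])
            for r in riders.get(a["AId"], []):
                tail = ET.SubElement(child, "tail")
                add(tail, "RID", r["RID"])
                add(tail, "Name", r["Name"])
            add(child, "Name", a["Name"])
        add(policy, "Name", p["Name"])
        add(policy, "Date", p["Date"])
    return root


root = build(POLICY, ATT, RIDER)
print(ET.tostring(root, encoding="unicode"))
```

The element order inside each Policy follows the desired output from the question (Pid, then the child elements, then Name and Date), and the join keys are left out of child and tail so they are not repeated.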
Hope this helps.