Appending XML metadata to an HTML file using Python

Time: 2013-06-26 05:10:15

Tags: python html xml

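I am extracting metadata from XML files and want to add it to a custom HTML file.

I can extract the relevant information from the XML, but I cannot get the updated information added/appended to my HTML file without overwriting the information written previously.

I want to produce a block with the same table layout for each XML file processed. I think this may be an indentation problem.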

import xml.etree.ElementTree as ET

html_head = """

<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title></title>
</head>"""

fh = open(r'D:\Temp\CSD_HTML\file.html', 'wb')
fh.write(html_head)

XML_List = [r'D:\Temp\file1.xml', r'D:\Temp\file2.xml']
for xml in XML_List:
    print xml + '\n'
    path = xml
    tree = ET.parse(path)

    for node in tree.findall('.//title'):
        title = node.text
        print 'Title: ' + node.text

    for node in tree.findall('.//westbc'):
        westbc = node.text
        print 'West: ' + node.text

    for node in tree.findall('.//eastbc'):
        eastbc = node.text
        print 'East: ' + node.text

    for node in tree.findall('.//northbc'):
        northbc = node.text
        print 'North: ' + node.text

    for node in tree.findall('.//southbc'):
        southbc = node.text
        print 'South: ' + node.text

    for node in tree.findall('.//geogunit'):
        geogunit = node.text
        print 'Geographic Units: ' + node.text

    for node in tree.findall('.//horizdn'):
        horizdn = node.text
        print 'Projection: ' + node.text

    for node in tree.findall('.//ellips'):
        ellips = node.text
        print 'Ellipsoid: ' + node.text

        html_body = """

        <body>
        <p>&nbsp;</p>
        <table width="800" border="0">
          <tr>
            <td width="309" rowspan="5"><img src="Thumbs/img.jpg" alt="" width="300" height="300" align="left"></td>
            <td width="4" rowspan="5">&nbsp;</td>
            <td height="50" colspan="3">Title: """ + title + """</td>
          </tr>
          <tr>
            <td width="150" height="50">&nbsp;</td>
            <td width="165" height="50">North: """ + northbc + """</td>
            <td width="150" height="50">&nbsp;</td>
          </tr>
          <tr>
            <td height="50">West: """ + westbc + """</td>
            <td height="50">&nbsp;</td>
            <td height="50">East: """ + eastbc + """</td>
          </tr>
          <tr>
            <td height="50">&nbsp;</td>
            <td height="50">South: """ + southbc + """</td>
            <td height="50">&nbsp;</td>
          </tr>
          <tr>
            <td height="150" colspan="3"><p>Geographic Units: """ + geogunit + """</p>
            <p>Projection: """ + horizdn + """</p>
            <p>Ellipsoid: """ + ellips + """</p></td>
          </tr>
        </table>
        <p>&nbsp;</p>
        </body>"""


        fh = open(r'D:\Temp\CSD_HTML\file.html', 'at')  ## Remove this line: fh is already open, so just write to it
        fh.write(html_body)

html_tail = """

</html>"""

fh = open(r'D:\Temp\CSD_HTML\file.html', 'wb')  ## Remove this line: reopening in 'wb' mode truncates the file and discards everything written so far
fh.write(html_tail)
fh.close()

del tree

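Any advice and guidance would be greatly appreciated.

Apologies, I was able to answer my own question. The repeated references to the file had to be removed: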

fh = open(r'D:\Temp\CSD_HTML\file.html', 'wb')

The file only needs to be opened once, at the start of the code, for appending to work.
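
As a rough illustration of that fix, here is a minimal sketch of the open-once pattern. It is written for Python 3, whereas the code in the question is Python 2. The names build_page, BLOCK and FIELDS are hypothetical, the table is a simplified stand-in for the original layout, and only some of the metadata fields are included. It is one possible arrangement under those assumptions, not the only way to do it.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from html import escape

HTML_HEAD = """<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title></title>
</head>
<body>
"""

HTML_TAIL = """</body>
</html>
"""

# Hypothetical, simplified block; the original table layout could be used instead.
BLOCK = """<table width="800" border="0">
  <tr><td colspan="3">Title: {title}</td></tr>
  <tr><td>West: {westbc}</td><td>North: {northbc}, South: {southbc}</td><td>East: {eastbc}</td></tr>
</table>
"""

# Subset of the element names used in the question.
FIELDS = ("title", "westbc", "eastbc", "northbc", "southbc")


def build_page(xml_paths, html_path):
    """Write one HTML page containing one table block per XML file."""
    # Open the output once; every write below goes through this same handle,
    # so nothing written earlier is truncated by a second open() call.
    with open(html_path, "w", encoding="utf-8") as fh:
        fh.write(HTML_HEAD)
        for path in xml_paths:
            tree = ET.parse(path)
            # findtext returns the text of the first matching element,
            # or the default when no element matches.
            values = {
                name: escape(tree.findtext(".//" + name, default="") or "")
                for name in FIELDS
            }
            fh.write(BLOCK.format(**values))
        fh.write(HTML_TAIL)


if __name__ == "__main__":
    # Demo with throwaway files so the sketch runs without the D:\Temp paths.
    workdir = tempfile.mkdtemp()
    sample = ("<metadata><title>Sample {n}</title><westbc>10</westbc>"
              "<eastbc>20</eastbc><northbc>30</northbc><southbc>40</southbc></metadata>")
    xml_paths = []
    for n in (1, 2):
        xml_path = os.path.join(workdir, "file{}.xml".format(n))
        with open(xml_path, "w", encoding="utf-8") as f:
            f.write(sample.format(n=n))
        xml_paths.append(xml_path)
    out_path = os.path.join(workdir, "file.html")
    build_page(xml_paths, out_path)
    with open(out_path, encoding="utf-8") as f:
        print(f.read())
```

Because the with block holds a single handle for the whole run, the head, each per-file block and the tail end up in the file in order, and the file is closed automatically at the end.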

0 answers:

No answers