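Parsing an XML file in Python: no element found

Time: 2018-10-26 17:15:14

Tags: python xml

I am a Python beginner.

I want to be able to select the values of certain elements in an XML file. Here is my XML file: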

  <TempFolder>D:\Mooniology\DiSecTemp\160708_M02091_0202_000000000-APC99</TempFolder>
  <AnalysisFolder>D:\Mooniology\MiSeqAnalysis\160708_M0209831_0202_000000000-APC99</AnalysisFolder>
  <RunStartDate>160708</RunStartDate>
  <MostRecentWashType>PostRun</MostRecentWashType>
  <RecipeFolder>D:\Mooniology\MiSeq Control Software\CustomRecipe</RecipeFolder>
  <ILMNOnlyRecipeFolder>C:\Mooniology\MiSeq Control Software\Recipe</ILMNOnlyRecipeFolder>
  <SampleSheetName>20160708 ALK Amplicon NGS cDNA synthesis kit comparison</SampleSheetName>
  <SampleSheetFolder>Q:\GNO MiSeq\Jaya</SampleSheetFolder>
  <ManifestFolder>Q:\GNO MiSeq</ManifestFolder>
  <OutputFolder>\\rpbns4-lab\vol10\RMSdisect\160708_M02091_0202_000000000-APC99</OutputFolder>
  <FocusMethod>AutoFocus</FocusMethod>
  <SurfaceToScan>Both</SurfaceToScan>
  <SaveFocusImages>true</SaveFocusImages>
  <SaveScanImages>true</SaveScanImages>

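By "select the value" I mean this: suppose I want the value of the element named TempFolder. I want the script to print D:\Mooniology\DiSecTemp\160708_M02091_0202_000000000-APC99

Here is the code I am using to try to parse the file: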

#!/usr/bin/python2.7

import xml.etree.ElementTree as ET
tree = ET.parse('online.xml')
root = tree.getroot()
for child in root:
    print(child.tag, child.attrib)

Every time I run this code, no matter how I modify it (based on what I found through Google), the result is always the following error:

Traceback (most recent call last):
  File "./mindo.py", line 5, in <module>
    tree = ET.parse('online.xml')
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 657, in parse
    self._root = parser.close()
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1654, in close
    self._raiseerror(v)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: no element found: line 75, column 0

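I suspect the problem may be the XML file I am using. But since I am new to Python, I have to assume it is my code.

1 Answer:

Answer 0: (score: 0)

This happens because the XML is not well-formed, so it cannot be parsed: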

In [4]: tree = ET.parse('online.xml')
   ...: 
  File "<string>", line unknown
ParseError: junk after document element: line 2, column 2
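The same kind of failure can be reproduced without the file. The snippet below is a minimal sketch, assuming a made-up two-element fragment (the `fragment` string is illustrative, not the asker's full file): with two top-level elements, the parser finishes the first one and then rejects whatever follows it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: two top-level elements, so there is no single root.
fragment = "<RunStartDate>160708</RunStartDate>\n<FocusMethod>AutoFocus</FocusMethod>"

try:
    ET.fromstring(fragment)
    error_message = None
except ET.ParseError as err:
    # The parser stops at the second top-level element.
    error_message = str(err)

print(error_message)
```

Catching `ET.ParseError` this way is also a quick check for whether a given file is the problem rather than the surrounding script.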

An XML document needs exactly one root element that encloses everything else, for example:

  <params>
    <TempFolder>D:\Mooniology\DiSecTemp\160708_M02091_0202_000000000-APC99</TempFolder>
    <AnalysisFolder>D:\Mooniology\MiSeqAnalysis\160708_M0209831_0202_000000000-APC99</AnalysisFolder>
    <RunStartDate>160708</RunStartDate>
    <MostRecentWashType>PostRun</MostRecentWashType>
    ...
    ...
    ...
  </params>
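Once the content sits inside a single root, reading a value such as TempFolder is straightforward. The following is a sketch under stated assumptions: the `params` root name is taken from the example above, the sample string holds only a few of the elements, and `parse_fragment` is a hypothetical helper that wraps rootless text in a root element before parsing. The wrapping trick only works if the text does not begin with an XML declaration (`<?xml ...?>`).

```python
import xml.etree.ElementTree as ET

def parse_fragment(text, root_tag="params"):
    """Hypothetical helper: wrap rootless XML text in one root element and parse it.

    Assumes the text has no XML declaration at the top.
    """
    return ET.fromstring("<{0}>{1}</{0}>".format(root_tag, text))

# A few elements from the question, with no enclosing root.
# The raw string keeps the backslashes in the Windows path literal.
fragment = r"""
  <TempFolder>D:\Mooniology\DiSecTemp\160708_M02091_0202_000000000-APC99</TempFolder>
  <RunStartDate>160708</RunStartDate>
  <MostRecentWashType>PostRun</MostRecentWashType>
"""

root = parse_fragment(fragment)

# findtext returns the text of the first matching direct child, or None if absent.
temp_folder = root.findtext("TempFolder")
print(temp_folder)

# Collect every child element as a tag -> text mapping.
values = {child.tag: child.text for child in root}
```

For a file on disk, the same helper could be fed the result of reading the file as text. Note that the loop in the question prints `child.attrib`, which is the attribute dictionary and is empty for these elements; the value between the tags is in `child.text`.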