Saving XML with ETree in Python: namespaces are not preserved, ns0 and ns1 are added, and xmlns tags are removed

Asked: 2015-08-04 09:31:42

Tags: python xml lxml elementtree

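I have seen similar questions here, but none of them fully helped me. I also looked at the official documentation on namespaces and could not find anything that really helped; perhaps I am simply too new to XML formatting. I understand that I may need to create my own namespace dictionary? Either way, here is my situation:

I get a result from an API call that gives me XML, which is stored as a string in my Python application.

All I am trying to do is take this XML, swap one small value (the b:string value under ConditionValue/Default, but that is not relevant to this question), and then save it as a string so it can be sent later in a REST POST call.

The source XML looks like this: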

<Context xmlns="http://Test.the.Sdk/2010/07" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<xmlns i:nil="true" xmlns="http://schema.test.org/2004/07/Test.Soa.Vocab" xmlns:a="http://schema.test.org/2004/07/System.Xml.Serialize"/>
<Conditions xmlns:a="http://schema.test.org/2004/07/Test.Soa.Vocab">
    <a:Condition>
        <a:xmlns i:nil="true" xmlns:b="http://schema.test.org/2004/07/System.Xml.Serialize"/>
        <Identifier>a23aacaf-9b6b-424f-92bb-5ab71505e3bc</Identifier>
        <Name>Code</Name>
        <ParameterSelections/>
        <ParameterSetCollections/>
        <Parameters/>
        <Summary i:nil="true"/>
        <Instance>25486d6c-36ba-4ab2-9fa6-0dbafbcf0389</Instance>
        <ConditionValue>
            <ComplexValue i:nil="true"/>
            <Text i:nil="true" xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays"/>
            <Default>
                <ComplexValue i:nil="true"/>
                <Text xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
                    <b:string>NULLCODE</b:string>
                </Text>
            </Default>
        </ConditionValue>
        <TypeCode>String</TypeCode>
    </a:Condition>
    <a:Condition>
        <a:xmlns i:nil="true" xmlns:b="http://schema.test.org/2004/07/System.Xml.Serialize"/>
        <Identifier>0af860f6-5611-4a23-96dc-eb3863975529</Identifier>
        <Name>Content Type</Name>
        <ParameterSelections/>
        <ParameterSetCollections/>
        <Parameters/>
        <Summary i:nil="true"/>
        <Instance>6364ec20-306a-4cab-aabc-8ec65c0903c9</Instance>
        <ConditionValue>
            <ComplexValue i:nil="true"/>
            <Text i:nil="true" xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays"/>
            <Default>
                <ComplexValue i:nil="true"/>
                <Text xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
                    <b:string>Standard</b:string>
                </Text>
            </Default>
        </ConditionValue>
        <TypeCode>String</TypeCode>
    </a:Condition>
</Conditions>

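My job is to swap one of those values, preserve the entire structure of the source, and use the result to submit a POST later in the application.

The problem I am running into is that when the XML is saved to a string or a file, the namespaces get completely mangled: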

<ns0:Context xmlns:ns0="http://Test.the.Sdk/2010/07" xmlns:ns1="http://schema.test.org/2004/07/Test.Soa.Vocab" xmlns:ns3="http://schemas.microsoft.com/2003/10/Serialization/Arrays" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ns1:xmlns xsi:nil="true" />
<ns0:Conditions>
<ns1:Condition>
<ns1:xmlns xsi:nil="true" />
<ns0:Identifier>a23aacaf-9b6b-424f-92bb-5ab71505e3bc</ns0:Identifier>
<ns0:Name>Code</ns0:Name>
<ns0:ParameterSelections />
<ns0:ParameterSetCollections />
<ns0:Parameters />
<ns0:Summary xsi:nil="true" />
<ns0:Instance>25486d6c-36ba-4ab2-9fa6-0dbafbcf0389</ns0:Instance>
<ns0:ConditionValue>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text xsi:nil="true" />
<ns0:Default>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text>
<ns3:string>NULLCODE</ns3:string>
</ns0:Text>
</ns0:Default>
</ns0:ConditionValue>
<ns0:TypeCode>String</ns0:TypeCode>
</ns1:Condition>
<ns1:Condition>
<ns1:xmlns xsi:nil="true" />
<ns0:Identifier>0af860f6-5611-4a23-96dc-eb3863975529</ns0:Identifier>
<ns0:Name>Content Type</ns0:Name>
<ns0:ParameterSelections />
<ns0:ParameterSetCollections />
<ns0:Parameters />
<ns0:Summary xsi:nil="true" />
<ns0:Instance>6364ec20-306a-4cab-aabc-8ec65c0903c9</ns0:Instance>
<ns0:ConditionValue>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text xsi:nil="true" />
<ns0:Default>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text>
<ns3:string>Standard</ns3:string>
</ns0:Text>
</ns0:Default>
</ns0:ConditionValue>
<ns0:TypeCode>String</ns0:TypeCode>
</ns1:Condition>
</ns0:Conditions>

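I have reduced my code to its most basic form and I still get the same result, so this has nothing to do with how I normally manipulate the file: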

import xml.etree.ElementTree as ET
import requests

get_context_xml = 'http://localhost/testapi/returnxml'  # returns the first XML example above
source_context_xml = requests.get(get_context_xml)

# fromstring() needs the response body, not the Response object
Tree = ET.fromstring(source_context_xml.content)

# Ensure the original namespaces are intact.
for Conditions in Tree.iter('{http://schema.test.org/2004/07/Test.Soa.Vocab}Condition'):
    print("success")

# tostring() returns bytes by default, so write in binary mode
with open('/home/memyself/output.xml', 'wb') as f:
    f.write(ET.tostring(Tree))

2 answers:

Answer 0 (score: 12)

You need to register the prefix and the namespace before calling fromstring() (reading the XML) to avoid the default namespace prefixes (such as ns0, ns1, etc.).

You can use the ET.register_namespace() function. Example:

ET.register_namespace('<prefix>', 'http://Test.the.Sdk/2010/07')
ET.register_namespace('a', 'http://schema.test.org/2004/07/Test.Soa.Vocab')

If you do not want a prefix, you can leave <prefix> empty.
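A minimal sketch of this approach on Python 3, assuming a trimmed-down version of the question's XML. The SAMPLE string and the variable names are illustrative and not taken from the original answer. The default namespace is registered with an empty prefix, and the XMLSchema-instance namespace is registered as i, because ElementTree would otherwise write it as xsi.

```python
import xml.etree.ElementTree as ET

# Illustrative, shortened stand-in for the XML returned by the API.
SAMPLE = (
    '<Context xmlns="http://Test.the.Sdk/2010/07" '
    'xmlns:i="http://www.w3.org/2001/XMLSchema-instance">'
    '<Conditions xmlns:a="http://schema.test.org/2004/07/Test.Soa.Vocab">'
    '<a:Condition><Name>Code</Name><Summary i:nil="true"/></a:Condition>'
    '</Conditions></Context>'
)

# Register the prefixes before serializing; '' means "default namespace".
ET.register_namespace('', 'http://Test.the.Sdk/2010/07')
ET.register_namespace('a', 'http://schema.test.org/2004/07/Test.Soa.Vocab')
ET.register_namespace('i', 'http://www.w3.org/2001/XMLSchema-instance')

root = ET.fromstring(SAMPLE)

# encoding='unicode' makes tostring() return str instead of bytes (Python 3).
output = ET.tostring(root, encoding='unicode')
print(output)
```

The prefixes now survive the round trip, but the output is still not byte-for-byte identical to the input: ElementTree writes every namespace declaration it uses on the root element and drops declarations that no tag or attribute refers to.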


Answer 1 (score: 0)

First off, welcome to the StackOverflow network! Technically, @anand-s-kumar is correct. However, there was a minor misuse of the tostring function, and the namespaces might not always be known to the code, or be the same between tags or XML files. Also, inconsistencies between the lxml and xml.etree libraries, and between Python 2.x and 3.x, make handling this difficult.
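For the case where the namespaces are not known ahead of time, one option is to read the declarations from the document itself and register each one before parsing. The sketch below is one possible way to do that on Python 3, not something taken from either answer; register_all_namespaces is a hypothetical helper name. It relies on the start-ns event of ET.iterparse, which yields a (prefix, uri) pair for every namespace declaration.

```python
import io
import xml.etree.ElementTree as ET


def register_all_namespaces(xml_text):
    """Register every prefix/URI pair declared in xml_text (hypothetical helper).

    Returns the list of (prefix, uri) pairs that were registered.
    """
    registered = []
    source = io.StringIO(xml_text)
    for _event, (prefix, uri) in ET.iterparse(source, events=('start-ns',)):
        try:
            ET.register_namespace(prefix, uri)
        except ValueError:
            # ElementTree reserves prefixes of the form ns0, ns1, ...
            continue
        registered.append((prefix, uri))
    return registered


sample = '<root xmlns="urn:x" xmlns:p="urn:y"><p:child/></root>'
found = register_all_namespaces(sample)
round_trip = ET.tostring(ET.fromstring(sample), encoding='unicode')
print(round_trip)
```

register_namespace keeps only one URI per prefix, so if a document binds the same prefix to different URIs in different places (as the question's XML does with a and b), only the last binding is kept and the other URIs may still be written with generated prefixes.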

This function iterates through all of the elements in the XML tree that is passed in and edits the XML tags to remove the namespaces. Note that by doing this, some data may be lost.

import re

def remove_namespaces(tree):
    # ElementTree stores qualified tags as "{uri}local"; keep only the local part.
    for el in tree.iter():  # getiterator() was removed in Python 3.9
        match = re.match(r"^(?:\{.*?\})?(.*)$", el.tag)
        if match:
            el.tag = match.group(1)

I myself just ran into this problem, and hacked together a quick solution. I tested this on about 81,000 XML files (averaging around 150 MB each) that had this problem, and all of them were fixed. Note that this isn't exactly an optimal solution, but it is relatively efficient and worked quite well for me.
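To illustrate the data-loss caveat, here is a small sketch that is separate from the tested function above. strip_namespaces is a hypothetical variant that also strips namespaces from attribute names, and the sample document is made up. After stripping, two elements that differed only by namespace can no longer be told apart.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Drop the '{uri}' part from every tag and attribute name, in place.

    Hypothetical variant of remove_namespaces that also handles attributes.
    """
    for el in root.iter():
        if isinstance(el.tag, str) and el.tag.startswith('{'):
            el.tag = el.tag.split('}', 1)[1]
        # Iterate over a copy of the keys because the dict is modified.
        for name in list(el.attrib):
            if name.startswith('{'):
                el.attrib[name.split('}', 1)[1]] = el.attrib.pop(name)
    return root


doc = ET.fromstring(
    '<Context xmlns="http://Test.the.Sdk/2010/07" '
    'xmlns:a="http://schema.test.org/2004/07/Test.Soa.Vocab">'
    '<Name>outer</Name><a:Name>inner</a:Name></Context>'
)
strip_namespaces(doc)
result = ET.tostring(doc, encoding='unicode')
print(result)
```

Both Name elements come out unqualified, so a service that expects the original namespaces (such as the one in the question) would likely reject the result. Stripping is best suited to cases where the XML is only being read.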

CREDIT: The idea and code structure of remove_namespaces originally come from Jochen Kupperschmidt.