
Time: 2017-06-17 00:40:38

Tags: python lxml pubmed


<p xmlns="https://jats.nlm.nih.gov/ns/archiving/1.0/"> 
Recently, a first step in this direction has been taken
in the form of the framework called &#8220;dynamical fingerprints&#8221;,
which has been developed to relate the experimental and MSM-derived
kinetic information.<sup><xref ref-type="bibr" rid="ref56">56</xref></sup> Several research
groups are now focused on developing protocols to systematically cross-validate
the MSM predictions and obtain MSM parameters using an optimization
protocol that produces the best estimate of the few slowest dynamics
modes of the protein dynamics.<sup><xref ref-type="bibr" rid="ref57">57</xref></sup></p>


<p xmlns="https://jats.nlm.nih.gov/ns/archiving/1.0/">
<s>Recently, a first step in this direction has been taken
in the form of the framework called &#8220;dynamical fingerprints&#8221;,
which has been developed to relate the experimental and MSM-derived
kinetic information.<sup><xref ref-type="bibr" rid="ref56">56</xref></sup> </s><s>Several research
groups are now focused on developing protocols to systematically cross-validate
the MSM predictions and obtain MSM parameters using an optimization
protocol that produces the best estimate of the few slowest dynamics
modes of the protein dynamics.<sup><xref ref-type="bibr" rid="ref57">57</xref></sup></s></p>


<s xmlns="https://jats.nlm.nih.gov/ns/archiving/1.0/">Recently, a first step in this direction has been taken
in the form of the framework called &#8220;dynamical fingerprints&#8221;,
which has been developed to relate the experimental and MSM-derived
kinetic information.<sup><xref ref-type="bibr" rid="ref56">56</xref></sup> </s>

<s xmlns="https://jats.nlm.nih.gov/ns/archiving/1.0/">Several research
groups are now focused on developing protocols to systematically cross-validate
the MSM predictions and obtain MSM parameters using an optimization
protocol that produces the best estimate of the few slowest dynamics
modes of the protein dynamics.<sup><xref ref-type="bibr" rid="ref57">57</xref></sup></s>


from lxml import etree

if __name__=="__main__":

  xml1 = '''<p xmlns="https://jats.nlm.nih.gov/ns/archiving/1.0/"> 
Recently, a first step in this direction has been taken
in the form of the framework called &#8220;dynamical fingerprints&#8221;,
which has been developed to relate the experimental and MSM-derived
kinetic information.<sup><xref ref-type="bibr" rid="ref56">56</xref></sup> Several research
groups are now focused on developing protocols to systematically cross-validate
the MSM predictions and obtain MSM parameters using an optimization
protocol that produces the best estimate of the few slowest dynamics
modes of the protein dynamics.<sup><xref ref-type="bibr" rid="ref57">57</xref></sup></p>'''

  print(xml1)

  root = etree.XML(xml1)
  sentences_info = []
  for sentence in root:
    # I want to do more fun stuff here with the result
    sentence_text = sentence.text
    ref_ids = []
    # Iterate the element directly; getchildren() is deprecated in lxml.
    for reference in sentence:
        if 'rid' in reference.attrib:
            # Store the id; the original assigned it but never appended it.
            ref_ids.append(reference.attrib['rid'])
    sent_par = {'reference_ids': ref_ids,'text': sentence_text}
    sentences_info.append(sent_par)
    print(sent_par)

2 answers:

Answer 0 (score: 0)


'query_builder' => function(\Prfuk\WebquotaBundle\Entity\WorkplaceRepository $repository) {
        return $repository


Answer 1 (score: 0)


<Element {https://jats.nlm.nih.gov/ns/archiving/1.0/}p at 0x108219048>
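The braces in that repr are the namespace written in `{uri}localname` form, which is why an unqualified tag name finds nothing. A minimal sketch of the effect, assuming a shortened stand-in paragraph (`sample` and `JATS_NS` are illustrative names, not from the question; the `ImportError` fallback to the standard library is also an assumption, added only so the snippet runs without lxml):

```python
try:
    from lxml import etree
except ImportError:  # the calls used below exist in both libraries
    import xml.etree.ElementTree as etree

# Namespace URI copied from the question's markup.
JATS_NS = "https://jats.nlm.nih.gov/ns/archiving/1.0/"

# Shortened stand-in for the question's paragraph (hypothetical text).
sample = (
    '<p xmlns="%s">Text.<sup><xref ref-type="bibr" rid="ref56">56</xref></sup>'
    ' More text.<sup><xref ref-type="bibr" rid="ref57">57</xref></sup></p>' % JATS_NS
)
root = etree.fromstring(sample)

# Every tag carries the namespace, so a bare name does not match.
unqualified = root.findall("sup")

# Qualifying the name with the {uri} prefix matches both <sup> children.
qualified = root.findall("{%s}sup" % JATS_NS)

print(root.tag)
print(len(unqualified), len(qualified))
```
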

You can remove the namespace from the XML using the following function:

from lxml import etree

def remove_namespace(tree):
    """Strip the '{uri}' prefix from every element tag, in place."""
    for node in tree.iter():
        try:
            has_namespace = node.tag.startswith('{')
        except AttributeError:
            continue  # node.tag is not a string (node is a comment or similar)
        if has_namespace:
            node.tag = node.tag.split('}', 1)[1]


tree = etree.fromstring(xml1)
remove_namespace(tree) # remove namespace
tree.findall('sup') # output as [<Element sup at 0x1081d73c8>, <Element sup at 0x1081d7648>]
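
If you would rather keep the namespace intact, one alternative is to pass a prefix map to the search calls. The following is only a sketch under assumptions: it expects the `<s>`-wrapped form of the paragraph shown in the question, and the helper name `sentences_with_refs`, the `j` prefix, and the shortened `sample` string are all hypothetical. The fallback import is there only so the snippet also runs where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:  # the calls used below exist in both libraries
    import xml.etree.ElementTree as etree

# URI copied from the question; "j" is an arbitrary local alias for it.
NS = {"j": "https://jats.nlm.nih.gov/ns/archiving/1.0/"}

def sentences_with_refs(paragraph_xml):
    """Return one dict per <s> child: its flattened text and the rid values it cites."""
    root = etree.fromstring(paragraph_xml)
    results = []
    for sentence in root.findall("j:s", namespaces=NS):
        # itertext() also yields descendant text, so citation labels
        # such as "56" end up in the sentence text.
        text = " ".join("".join(sentence.itertext()).split())
        # The rid attribute itself is not namespaced, so a plain get() works.
        ref_ids = [xref.get("rid")
                   for xref in sentence.iterfind(".//j:xref", namespaces=NS)]
        results.append({"reference_ids": ref_ids, "text": text})
    return results

# Shortened stand-in for the <s>-wrapped paragraph (hypothetical text).
sample = (
    '<p xmlns="https://jats.nlm.nih.gov/ns/archiving/1.0/">'
    '<s>First sentence.<sup><xref ref-type="bibr" rid="ref56">56</xref></sup> </s>'
    '<s>Second sentence.<sup><xref ref-type="bibr" rid="ref57">57</xref></sup></s>'
    '</p>'
)
info = sentences_with_refs(sample)
for entry in info:
    print(entry)
```

This leaves the tree unmodified, which matters if the XML is serialized again later; stripping the namespace is simpler when the tree is only read.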