Need help parsing XML in Python

Time: 2018-04-01 14:25:35

Tags: python parsing xml-parsing xlm

I have a folder containing many XML files. Here is one of them, which I will call xmlstr:

    <?xml version="1.0"?>
    <case>
    <name>Sharman Networks Ltd v Universal Music Australia Pty Ltd [2006] FCA 1 (5 January 2006)</name>
    <AustLII>http://www.austlii.edu.au/au/cases/cth/FCA/2006/1.html</AustLII>
    <citations>
    <citation "id=c0">

    <class>cited</class>
    <tocase>Universal Music Australia Pty Ltd v Sharman License Holdings Ltd (2005) 220 ALR 1</tocase>
    <text>2 Wilcox J delivered judgment on the complex issues of liability arising in the primary proceedings on 5 September 2005 ( Universal Music Australia Pty Ltd v Sharman License Holdings Ltd (2005) 220 ALR 1). In the meantime, Ms Hemming had filed two disclosure affidavits pursuant to Wilcox J's orders of 22 March 2005 whilst Sharman License and Sharman Networks had unsuccessfully sought several stays on various grounds of that same order insofar as it applied to them (see Universal Music Australia Pty Ltd v Sharman License Holdings Ltd [2005] FCA 406 per Hely J, delivered 8 April 2005; Universal Music Australia Pty Ltd v Sharman License Holdings Ltd [2005] FCA 441 per Wilcox J, delivered 15 April 2005 and Sharman License Holdings Ltd v Universal Music Australia Pty Ltd [2005] FCA 505 per Moore J, delivered 28 April 2005). Disclosure affidavits were eventually sworn on behalf of Sharman License and Sharman Networks by Mr Gee on 19 April 2005, which were later superseded by further affidavits sworn also by Mr Gee on 16 June 2005. Sharman License and Sharman Networks had also unsuccessfully sought an enlargement of time in which to file an application for leave to appeal from Wilcox J's orders of 22 March 2005 (see Sharman License Holdings Ltd v Universal Music Australia Pty Ltd [2005] FCA 802 per Lindgren J, delivered on 17 June 2005).</text>
    </citation>
    <citation "id=c1">
    <class>cited</class>
<text>2 Wilcox J delivered judgment on the complex issues of liability arising in the primary proceedings on 5 September 2005 ( Universal Music Australia Pty Ltd v Sharman License Holdings Ltd (2005) 220 ALR 1). In the meantime, Ms Hemming had filed two disclosure affidavits pursuant to Wilcox J's orders of 22 March 2005 whilst Sharman License and Sharman Networks had unsuccessfully sought several stays on various grounds of that same order insofar as it applied to them (see Universal Music Australia Pty Ltd v Sharman License Holdings Ltd [2005] FCA 406 per Hely J, delivered 8 April 2005; Universal Music Australia Pty Ltd v Sharman License Holdings Ltd [2005] FCA 441 per Wilcox J, delivered 15 April 2005 and Sharman License Holdings Ltd v Universal Music Australia Pty Ltd [2005] FCA 505 per Moore J, delivered 28 April 2005). Disclosure affidavits were eventually sworn on behalf of Sharman License and Sharman Networks by Mr Gee on 19 April 2005, which were later superseded by further affidavits sworn also by Mr Gee on 16 June 2005. Sharman License and Sharman Networks had also unsuccessfully sought an enlargement of time in which to file an application for leave to appeal from Wilcox J's orders of 22 March 2005 (see Sharman License Holdings Ltd v Universal Music Australia Pty Ltd [2005] FCA 802 per Lindgren J, delivered on 17 June 2005).

24 All that was referrable of course to the implications of the payment of $1,116,405.63 by Ms Hemming to TIL, following the sale of her Sydney residence on 4 February 2005; that payment appears to have been made out of the proceeds of a sale of that residence, which was effected for the gross price of $2,100,000 to a person identified by the evidence as an accountant of certain of the Sharman companies. There was no sufficiently detailed or otherwise cogent evidence as to who exercised the substantial or underlying control of decision making of TIL, or as to the basis of or reasons for such alleged indebtedness having crystallised in the first place. The state of the evidence as to the control of TIL was itself the subject of disputation before Moore J and senior counsel for the Sharman applicants sought to attribute error to his Honour's judgment for the further reason that he had failed to make a finding as to Ms Hemming's control, or otherwise, of that entity. The Sharman applicants postulated that the 'remark' made by Lindgren J at [13] of his Honour's reasons for judgment in Sharman License Holdings Ltd v Universal Music Australia Pty Ltd [2005] FCA 802 that '[Wilcox J] accepted [in the course of granting the Mareva relief on 22 March 2005] that the Sharman Companies were controlled by Ms Hemming by reason of a "client services agreement" between her and TIL dated 8 April 2002' was an 'unsure foundation for any finding of control of the Sharman trust or the Sharman companies [by Ms Hemming]', and was thus inappropriately or impermissibly relied upon by Moore J in formulating his reasons for judgment. 
That submission lacked merit, particularly in the light of [31] of Lindgren J's reasons for judgment in which his Honour paraphrased the two-fold acceptance, given in cross-examination by the solicitor acting for Sharman License and Sharman Networks in their application before Lindgren J, that TIL as trustee of the Sharman trust was the ultimate beneficial owner of all the shares issued in Sharman License and Sharman Networks, and moreover that Wilcox J had himself appeared to accept that in consequence of the client services agreement, Ms Hemming 'controlled the Sharman trust'.

25 The Music companies had submitted to Moore J that given the evidentiary shortcomings on a subject readily susceptible to documentary demonstration, inclusive of banking records I might add, there was in truth and reality no antecedent loan, that the transfer of those funds by Ms Hemming to TIL in Vanuatu constituted a sham transaction, and consequently that those monies remained her own property beneficially, and should have been identified and disclosed as such in her affidavit provided in the Mareva context. Once more, so it was asserted by the Sharman applicants, his Honour declined to make any concluded finding on the subject. The point is however that his Honour had been able to infer from the surrounding circumstances I have already outlined that there was some force in the Music companies' submission. But in any event his Honour was of the view that he could permit cross-examination of Ms Hemming on and in relation to those matters because at least doubt existed in relation to that area of enquiry.</text>
</citation>
<citation "id=c5">
<class>cited</class>
<tocase>D&eacute;cor Corporation Pty Ltd v Dart Industries Inc (1991) 33 FCR 397</tocase>
<text>6 Section 24(1A) of the Federal Court of Australia Act 1976 (Cth) stipulates that an appeal shall not be brought from a judgment of the Court constituted by a single judge, being a judgment that is interlocutory in nature, unless the Court or a Judge gives leave to appeal. Although s 24(1A) does not purport to qualify or limit the Court's discretion (see D&eacute;cor Corporation Pty Ltd v Dart Industries Inc (1991) 33 FCR 397 at 399 in the joint reasons for judgment of Sheppard, Burchett and Heerey JJ), the Courts have developed general principles which inform the exercise of the discretion to refuse or grant leave to appeal from an interlocutory judgment. The rationale for those principles is the public interest in the efficient administration of justice, and the maintenance of 'the integrity and vigour of the procedures of the court, including as they do, the immediate involvement of the judge at all stages in the progress of cases to trial' ( Bomanite Pty Ltd v Slatex Corp Australia Pty Ltd (1991) 104 ALR 165 at 173, per Gummow J). One consequence sought to be avoided is the expansion of expensive and delaying pre-trial litigation involved in appeals on issues of practice and procedure, and the concomitant reduction in the authority of the trial judge, should such appeals be frequently entertained ( Bomanite at 176, per French J).

 "...I am of the opinion that...there is a material difference between an exercise of discretion on a point of practice or procedure and an exercise of discretion which determines substantive rights. In the former class of case, if a tight rein were not kept upon interference with the orders of Judges of first instance, the result would be disastrous to the proper administration of justice. The disposal of cases could be delayed interminably, and costs heaped up indefinitely, if a litigant with a long purse or a litigious disposition could, at will, in effect transfer all exercises of discretion in interlocutory applications from a Judge in chambers to a Court of Appeal."



 ...It is safe to say that the question of injustice flowing from the order appealed from will generally be a relevant and necessary consideration.'</text>
</citation>
<citation "id=c6">
<class>cited</class>
<tocase>Bomanite Pty Ltd v Slatex Corp Australia Pty Ltd (1991) 104 ALR 165</tocase>
<text>6 Section 24(1A) of the Federal Court of Australia Act 1976 (Cth) stipulates that an appeal shall not be brought from a judgment of the Court constituted by a single judge, being a judgment that is interlocutory in nature, unless the Court or a Judge gives leave to appeal. Although s 24(1A) does not purport to qualify or limit the Court's discretion (see D&eacute;cor Corporation Pty Ltd v Dart Industries Inc (1991) 33 FCR 397 at 399 in the joint reasons for judgment of Sheppard, Burchett and Heerey JJ), the Courts have developed general principles which inform the exercise of the discretion to refuse or grant leave to appeal from an interlocutory judgment. The rationale for those principles is the public interest in the efficient administration of justice, and the maintenance of 'the integrity and vigour of the procedures of the court, including as they do, the immediate involvement of the judge at all stages in the progress of cases to trial' ( Bomanite Pty Ltd v Slatex Corp Australia Pty Ltd (1991) 104 ALR 165 at 173, per Gummow J). One consequence sought to be avoided is the expansion of expensive and delaying pre-trial litigation involved in appeals on issues of practice and procedure, and the concomitant reduction in the authority of the trial judge, should such appeals be frequently entertained ( Bomanite at 176, per French J)'.</text>
</citation>
<citation "id=c7">
<class>cited</class>

    <tocase>Adam P Brown Male Fashions Proprietary Limited v Phillip Morris Incorporated [1981] HCA 39 ; (1981) 148 CLR 170</tocase>
    <AustLII>http://www.austlii.edu.au/au/cases//cth/HCA/1981/39.html</AustLII>
    <text>7 At least for those reasons, this Court has held on a number of occasions that typically a party seeking leave to appeal from an interlocutory judgment ought to establish, first, that in all the circumstances, the decision from which leave is sought to appeal is attended with sufficient doubt to warrant the same being reconsidered by the Full Court, and secondly, that substantial injustice would result if such leave was to be refused, supposing the decision to have been wrong: see D&eacute;cor at 398. That those two questions were the touchstone of exercise of discretion in matters of this kind was common ground between the parties. I observe that it is well accepted that those criteria are not to be applied rigidly or fixedly, and the Court must bear in mind all of the circumstances of the particular case: see in that regard Adam P Brown Male Fashions Proprietary Limited v Phillip Morris Incorporated [1981] HCA 39 ; (1981) 148 CLR 170 at 177, where Gibbs CJ, Aickin, Wilson and Brennan JJ said:

     'For ourselves, we believe it to be unnecessary and indeed unwise to lay down rigid and exhaustive criteria. The circumstances of different cases are infinitely various. We would merely repeat, with approval, the oft-cited statement of Sir Frederick Jordan in re the Will of F B Gilbert (dec) (1946) 46 SR (NSW) 318 at 323: 



     "...I am of the opinion that...there is a material difference between an exercise of discretion on a point of practice or procedure and an exercise of discretion which determines substantive rights. In the former class of case, if a tight rein were not kept upon interference with the orders of Judges of first instance, the result would be disastrous to the proper administration of justice. The disposal of cases could be delayed interminably, and costs heaped up indefinitely, if a litigant with a long purse or a litigious disposition could, at will, in effect transfer all exercises of discretion in interlocutory applications from a Judge in chambers to a Court of Appeal."



     ...It is safe to say that the question of injustice flowing from the order appealed from will generally be a relevant and necessary consideration.'</text>
    </citation>

<citation "id=c16">
<class>cited</class>
<tocase>Cardile v LED Builders Pty Ltd [1999] HCA 18 ; (1999) 198 CLR 380</tocase>
<AustLII>http://www.austlii.edu.au/au/cases//cth/HCA/1999/18.html</AustLII>
<text>27 My reading of his Honour's reasons here was that he was far from satisfied with the nature or extent of the purported offshore structures and transactions to the extent apparent from the evidence, involving as they did the creation of a trust estate somewhat cognate to what have often been described as 'blind trusts'. Concerns of that nature appear to have persuaded or assisted to persuade the primary judge of the need to order that Ms Hemming submit to cross-examination on her disclosure affidavits. In determining to take that approach, his Honour paid regard to the relevant authorities dealing with both the grant of Mareva relief, and the making of orders ancillary to the same, including orders requiring the swearing of disclosure affidavits and cross-examination on those affidavits. After reviewing the relevant principles enunciated in those authorities, his Honour concluded at [28]: 

 '...ultimately the cautionary words of the four members of the High Court in [ Cardile v LED Builders Pty Ltd [1999] HCA 18 ; (1999) 198 CLR 380 at 403-404] set out at [18] above must be heeded. Orders made in the Court's ancillary jurisdiction must be founded on a doctrinal and principled basis. A Mareva order is protective of the Court's processes, including the efficacy of execution of those orders. Orders concerning disclosure affidavits and cross examination can, in turn, be made to render the Mareva order more efficacious. This is the touchstone for determining whether leave should be given to cross examine. A relevant consideration in determining whether leave should be given might, in an appropriate case, be the failure of the deponent of a disclosure affidavit to disclose assets completely or promptly or both. In such a case, leave might be given because doubts might arise about whether the deponent had understood and accepted the obligations and burdens imposed by the Mareva order and the ancillary order requiring the disclosure affidavit. Cross examination might be appropriate to test whether the disclosure affidavits fully revealed all assets on which the Mareva order operated and which might be available to satisfy any judgment. However, in other cases, other more significant factors might support the granting of leave to cross examine.' 

31 In my opinion, and for the reasons I have largely foreshadowed in my observations upon the submissions already recorded, the application for leave to appeal brought by the Sharman applicants has not sufficient cogency to justify the grant of any such leave. The case of the Music companies presented to the primary judge (Moore J) for relief of the nature and to the extent granted was sufficiently in line with established principle as to be clear from 'sufficient doubt'. I do not think that the United Kingdom and Australian authorities establish inflexible requirements to the extent postulated by the Sharman applicants, in particular concerning the Court's jurisdiction to grant leave to cross-examine the deponents of disclosure affidavits in Mareva contexts. His Honour's approach in particular to the issue of granting leave to the Music companies to cross-examine Ms Hemming was soundly justified in the light of the evidentiary circumstances concerning the Sharman applicants' offshore trust structure, and the circumstances of and context in which such a substantial sum of money was transferred to an offshore company in the amount and in the context that occurred.</text>
</citation>
</citations>
</case>

I want to extract the text tags and create a new file that contains only the text inside those tags. So far I have tried the following code:

try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

root = ET.fromstring(xmlstr)
for page in list(root):
    content = page.find('text').text
    print(content)

When I run the code, I get the following error:

xml.etree.ElementTree.ParseError: XML or text declaration not at start of entity: line 2, column 4

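I think this is because of the first line of the XML file, <?xml version="1.0"?>, but even if I delete it the error persists. Can you help me? Any suggestion would be appreciated.

Thanks!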

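4 answers:

Answer 0 (score: 2)

Your XML string is not well-formed. First, you need to remove the line break in front of the XML declaration at the start of the string, like this: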

xmlstr = """<?xml version="1.0"?>

(Remember to end xmlstr with a closing """, the same way the multi-line string is opened above.)
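A minimal sketch of why the declaration has to come first, assuming the string in the question starts with a line break followed by four spaces of indentation (the variable names here are illustrative only). It reproduces the parse error and shows that stripping the surrounding whitespace is enough to get past it:

```python
import xml.etree.ElementTree as ET

# The XML declaration is preceded by a newline and four spaces,
# so it is not the very first thing in the string.
indented = """
    <?xml version="1.0"?>
    <case><name>demo</name></case>
"""

try:
    ET.fromstring(indented)
    parsed_with_leading_whitespace = True
except ET.ParseError as exc:
    parsed_with_leading_whitespace = False
    print("ParseError:", exc)

# Stripping the surrounding whitespace puts the declaration first.
root = ET.fromstring(indented.strip())
print(root.find("name").text)
```

Calling .strip() on the string has the same effect as moving the declaration up to the opening """ by hand.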

Second, you need to change XML attributes such as

<citation "id=c1">

to

<citation id="c1">

Otherwise you will get a malformed-XML exception.
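If editing every file by hand is not practical, the attribute fix can probably be automated. The following is only a sketch, assuming every broken attribute has exactly the "id=..." shape shown in the question; the regular expression is my own choice, not part of the original answer:

```python
import re
import xml.etree.ElementTree as ET

broken = '<citations><citation "id=c0"><class>cited</class></citation></citations>'

# Rewrite  "id=c0"  as  id="c0"; only the quoted id=... form is matched.
fixed = re.sub(r'"id=([^"]*)"', r'id="\1"', broken)

root = ET.fromstring(fixed)
citation_ids = [c.get("id") for c in root.iter("citation")]
print(citation_ids)
```

Once the attribute is well-formed, the id value is available through the normal get("id") call.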

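Answer 1 (score: 1)

This is not valid XML. If you squint, it looks a bit like badly written HTML. Enter BeautifulSoup, a package designed to clean up the ugliest of web pages. Because I did not want to worry about encodings, I let BS work on the raw file instead of reading it into a string: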

>>> from bs4 import BeautifulSoup
>>> with open("crud.xml", "rb") as fp:
...     soup = BeautifulSoup(fp)
... 
>>> text_nodes = soup.findAll("text")
>>> len(text_nodes)
6
>>> text_nodes[0]
<text>2 Wilcox J delivered judgment on the complex issues of...

Answer 2 (score: 0)

The problem is that your XML contains some special characters:

Try using the following code:

false

Answer 3 (score: 0)

First, you have to fix every place where "id=c01" should be id="c01":

clean_string = xmlstring.replace('"id=', 'id="')

Then you have to unescape the HTML entities you have in there:

import html
clean_string = html.unescape(clean_string)   
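To illustrate why this step matters, here is a small sketch using one <tocase> line modelled on the question's data. Without a DTD, &eacute; is not an entity the XML parser knows, so parsing fails until the entity is replaced by the character itself:

```python
import html
import xml.etree.ElementTree as ET

snippet = "<tocase>D&eacute;cor Corporation Pty Ltd v Dart Industries Inc</tocase>"

# &eacute; is an HTML entity, not one of XML's predefined entities.
try:
    ET.fromstring(snippet)
    entity_accepted = True
except ET.ParseError:
    entity_accepted = False

# html.unescape replaces the entity with the literal character.
unescaped = html.unescape(snippet)
tocase = ET.fromstring(unescaped)
print(tocase.text)
```

One caveat: html.unescape also turns &amp; and &lt; into raw & and < characters, which would make a file that contains them ill-formed again. The sample in the question does not appear to contain any, but other files in the folder might.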

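Finally, you have to remove the surrounding whitespace, either manually or simply with .strip(). Note that you also have to replace find('text') with find('.//text'), which finds text at any nesting level. Alternatively, you can specify the whole "route" to the text element.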

Here is the complete code to find a single text:

import html

try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

xmlstring = """ YOUR XML HERE """

clean_string = xmlstring.replace('"id=', 'id="')
clean_string = html.unescape(clean_string)
root = ET.fromstring(clean_string.strip())
content = root.find('.//text').text
print(content)

But I assume you want to find all the texts in the given XML file/string, so you can do this:
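A minimal sketch of collecting every text element, assuming the same cleanup steps as in this answer and using ElementTree's findall('.//text'). The helper name extract_texts and the shortened sample string are hypothetical, made up for illustration:

```python
import html
import xml.etree.ElementTree as ET


def extract_texts(xmlstring):
    """Return the content of every <text> element in one case file."""
    clean_string = xmlstring.replace('"id=', 'id="')
    clean_string = html.unescape(clean_string)
    root = ET.fromstring(clean_string.strip())
    # findall('.//text') returns every <text> element at any depth.
    return [(node.text or "").strip() for node in root.findall('.//text')]


# Shortened stand-in for one file from the question's folder.
sample = """
    <?xml version="1.0"?>
    <case>
    <citations>
    <citation "id=c0"><class>cited</class><text>first D&eacute;cor passage</text></citation>
    <citation "id=c1"><class>cited</class><text>second passage</text></citation>
    </citations>
    </case>
"""

texts = extract_texts(sample)
print(texts)
```

To get the new file the question asks for, the returned list could be joined with line breaks and written out with open(..., "w", encoding="utf-8"). The same function could then be called once per file in the folder.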