Get text including HTML tags when parsing XML on Android

Asked: 2014-12-18 12:53:43

Tags: android xml parsing xml-parsing

I have XML data like this:

<?xml version="1.0" encoding="UTF-8"?>
<Result>
<Status>OK</Status>
   <DisplayName>Az-Zahra Madressah</DisplayName>
   <Announcements>
      <Announcement>
         <Id>46</Id>
         <Title>AZM Dates</Title>
         <Description>
            <strong>December 21st, 28th; Winter Break</strong>
            ~ January 4th: Madressah and
            <strong>Acadamy ResumesAZM</strong>
            will provide snacks every first Sunday of the month.
         </Description>
         <Attachments />
      </Announcement>
   </Announcements>
   <Resources>
      <Resource>
         <Id>26</Id>
         <Title>Quran Competition Audio FIles</Title>
         <Description>* Quran Memorization Surah Competition Audio Files
* Quran Competition Surahs List</Description>
         <Attachments>broadcast_notes/26_Sura Ad duha 93.mp3,broadcast_notes/26_Ayatul Kursi 2-255,256,257.mp3,broadcast_notes/26_Sura Al Aala 87.mp3,broadcast_notes/26_Sura Al Fatiha Hamd 1.mp3,broadcast_notes/26_Sura Al Falaq 113.mp3,broadcast_notes/26_Sura Al Balad 90.mp3,broadcast_notes/26_Sura Al Asr 103.mp3,broadcast_notes/26_Sura Al Feel 105.mp3,broadcast_notes/26_Sura Al Qadr 97.mp3,broadcast_notes/26_Sura Al Maoon 107.mp3,broadcast_notes/26_Sura Al Jumua 62.mp3,broadcast_notes/26_Sura Al Kafiroon 109.mp3,broadcast_notes/26_Sura Al Kauthar 108.mp3,broadcast_notes/26_Sura Al Qariah 101.mp3,broadcast_notes/26_Sura At Teen 95.mp3,broadcast_notes/26_Sura At Tawheed 112.mp3,broadcast_notes/26_Sura At Takathur 102.mp3,broadcast_notes/26_Sura Asshams 91.mp3,broadcast_notes/26_Sura An Nasr 110.mp3,broadcast_notes/26_Sura An Naas 114.mp3,broadcast_notes/26_Sura AlInfitar 82.mp3,broadcast_notes/26_Sura Al Quraish 106.mp3,broadcast_notes/26_Surah AlInshirah 94.mp3,broadcast_notes/26_2014 Quran Competition Surahs (1).docx</Attachments>
      </Resource>
      <Resource>
         <Id>16</Id>
         <Title>AZM Calendar</Title>
         <Description>Az Zahra Madressah 2014-15 Calendar</Description>
         <Attachments>broadcast_notes/16_AZM CALENDAR 2014_2015.xlsx</Attachments>
      </Resource>
      <Resource>
         <Id>30</Id>
         <Title>Madresaah Schedule</Title>
         <Description>Madressah and Academy Students 2014-15 Schedule</Description>
         <Attachments>broadcast_notes/30_Madressah Schedule 2014-2015.xlsx</Attachments>
      </Resource>
   </Resources>
</Result>

Note this part:

<Description>
<strong>December 21st, 28th; Winter Break</strong>
       ~ January 4th: Madressah and
   <strong>Acadamy ResumesAZM</strong>
  will provide snacks every first Sunday of the month.
</Description>

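When I parse this, I only get the text "~ January 4th: Madressah and", but I want the whole content, including the HTML tags (i.e. the <strong> tags) as well as the text.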

In short, I want this text:

<strong>December 21st, 28th; Winter Break</strong>
       ~ January 4th: Madressah and
   <strong>Acadamy ResumesAZM</strong>
  will provide snacks every first Sunday of the month.

Here is the code I use to parse it:

Log.e("Text Desc", parser.getValue(eAnnoucements, "Description"));

Here is my getValue() method:

public String getValue(Element item, String str) {
    NodeList n = item.getElementsByTagName(str);
    return this.getElementValue(n.item(0));
}

Here is my getElementValue() method:

public final String getElementValue(Node elem) {
    Node child;
    if (elem != null) {
        if (elem.hasChildNodes()) {
            for (child = elem.getFirstChild(); child != null; child = child
                    .getNextSibling()) {
                if (child.getNodeType() == Node.TEXT_NODE) {
                    return child.getNodeValue();
                }
            }
        }
    }
    return "";
}
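
To illustrate why only one fragment comes back: <Description> has mixed content, so its children alternate between element nodes and text nodes, and the loop above returns at the first text node it meets. The following is a minimal sketch, not part of the original code; the class name MixedContentDemo, the method describeChildren, and the shortened sample XML are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, used only to show the child-node layout.
public class MixedContentDemo {

    // Lists the direct children of the first <Description> element,
    // labelling each one as a text node or an element node.
    static List<String> describeChildren(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node description = doc.getElementsByTagName("Description").item(0);

        List<String> out = new ArrayList<>();
        for (Node child = description.getFirstChild();
                child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                out.add("TEXT: " + child.getNodeValue());
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                out.add("ELEMENT: " + child.getNodeName());
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Shortened sample, assumed for illustration only.
        String xml = "<Description><strong>Winter Break</strong> ~ January 4th: "
                + "<strong>Academy</strong> resumes.</Description>";
        for (String line : describeChildren(xml)) {
            System.out.println(line);
        }
    }
}
```

Each <strong> shows up as a separate element node whose own text sits one level deeper, which is why a method that returns a single text node's value cannot produce the full markup.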

1 Answer:

Answer 0: (score: 0)

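Put your data inside a CDATA section, then call getTextContent on the node. Inside CDATA the parser treats the <strong> tags as plain character data rather than as child elements, so they are returned as part of the text.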

<Description>
    <![CDATA[<strong>December 21st, 28th; Winter Break</strong>
                  ~ January 4th: Madressah and
                   <strong>Acadamy ResumesAZM</strong>
                   will provide snacks every first Sunday of the month.\n]]>
 </Description>

CDATA reference: http://www.w3.org/TR/REC-xml/#sec-cdata-sect

This works with your XML once the CDATA section is added:

    NodeList list =  doc.getElementsByTagName("Description");

    if ( list.getLength() == 0 ) {
        return;
    }

    for ( int item = 0 ; item < list.getLength(); ++item ) {

        Node description = list.item(item);
        System.out.println("Contents:" + description.getTextContent() );

    }
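
For anyone who wants to try the fragment above on its own, here is a self-contained sketch of the same CDATA plus getTextContent approach. It is not taken from the answer: the class name CdataDescriptionDemo, the helper readDescriptions, and the shortened inline XML are assumptions made for illustration, and it parses from a String rather than from the asker's actual data source.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class wrapping the answer's loop in a runnable form.
public class CdataDescriptionDemo {

    // Returns the text content of every <Description> element in the XML.
    // Markup wrapped in CDATA is returned verbatim as part of the string.
    static List<String> readDescriptions(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList list = doc.getElementsByTagName("Description");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < list.getLength(); i++) {
            result.add(list.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Shortened sample, assumed for illustration only.
        String xml = "<Result><Description>"
                + "<![CDATA[<strong>Winter Break</strong> ~ January 4th]]>"
                + "</Description></Result>";
        for (String contents : readDescriptions(xml)) {
            System.out.println("Contents:" + contents);
        }
    }
}
```

Note that this only helps if whoever produces the XML can wrap the description in CDATA. Without the CDATA section, getTextContent still concatenates all the text but drops the <strong> tags themselves.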