Parsing an RSS feed using Split

Time: 2012-11-09 13:41:52

Tags: vb.net parsing rss split cdata

Hey all, I am trying to parse the following CDATA description:

 <description>
   <![CDATA[
     <p><b>Submission Date :</b> 2012-11-07 16:53:27<br/> <b>IP Address :</b> xx.xxx.xxx.xx<br/> <b>First Name :</b> dev<br/> <b>Email :</b> test3@here.com<br/> <b>18 yrs./Older :</b> YES<br/> <b>xxxxOffers :</b> <br/> </p>
   ]]>
</description>

When read, it looks like this:

Submission Date : 2012-11-07 16:53:27
IP Address : xx.xxx.xxx.xx
First Name : dev
Email : test3@here.com
18 yrs./Older : YES
xxxxOffers : [space here]

Currently I do the following:

description = description.Replace("<p>", "").Replace("</p>", "").Replace("<b>", "").Replace("</b>", "").Replace("<br/>", "")
Dim descriptionArray() As String = Split(description, " : ")

This produces the following result:

[Screenshot of the resulting array; image not preserved]

It should be broken down like this:

(0)Submission Date
(1)2012-11-07 16:53:27
(2)IP Address
(3)xx.xxx.xxx.xx
(4)First Name
(5)dev
(6)Email
(7)test3@here.com
(8)18 yrs./Older
(9)YES
(10)xxxxOffers
(11)[space here]

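I can't seem to find a way to separate the CDATA into individual values without using Split on ":", and that throws everything off from that point on, because the time ( 16:53:27 ) already contains ":".

So I tried to compensate by checking for " : " instead, but that still doesn't give me the result I need.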

1 Answer:

Answer 0 (score: 1):

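I suggest you collect the information in a dictionary-style object. That way you get each label and its value in a single lookup. For example, you could do something like this: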

Dim ht As New Hashtable()
' Splitting on <br/> gives one field per segment, e.g. "<p><b>Submission Date :</b> 2012-11-07 16:53:27"
For Each inLine As String In Split(description, "<br/>")
    ' Splitting on </b> gives "<p><b>Submission Date :" and " 2012-11-07 16:53:27"
    Dim keyValue() As String = Split(inLine, "</b>")
    ' Skip segments that hold no key/value pair, such as the trailing " </p>"
    If keyValue.Length < 2 Then Continue For
    ' Then clean up the remaining <p>, <b>, ... either by "<" and ">" or by full tag
    ' Add(key, value)
    ht.Add(keyValue(0), keyValue(1))
Next

If for any reason you don't want to use an enumerable object, you can still use this as a baseline.
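
For completeness, here is a minimal sketch of the same idea with the tag cleanup filled in. It is only one way to do it, under a few assumptions: the module name `DescriptionParser` and the function `ParseDescription` are made up for illustration, a `Dictionary(Of String, String)` stands in for the `Hashtable`, and a simple regular expression is assumed to be good enough to strip the leftover `<p>` and `<b>` tags from this particular feed.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module DescriptionParser
    ' Hypothetical helper (not from the original post): turns the CDATA
    ' HTML into label/value pairs.
    Function ParseDescription(description As String) As Dictionary(Of String, String)
        Dim fields As New Dictionary(Of String, String)()

        ' Each field ends with <br/>, so split on that first.
        For Each segment As String In description.Split(New String() {"<br/>"}, StringSplitOptions.None)
            ' The label sits before </b>, the value after it.
            Dim parts() As String = segment.Split(New String() {"</b>"}, 2, StringSplitOptions.None)

            ' Segments without </b> (such as the trailing " </p>") hold no pair.
            If parts.Length < 2 Then Continue For

            ' Strip any remaining tags, then trim whitespace and the
            ' trailing colon from the label.
            Dim key As String = Regex.Replace(parts(0), "<[^>]+>", "").Trim().TrimEnd(":"c).Trim()
            Dim value As String = Regex.Replace(parts(1), "<[^>]+>", "").Trim()

            fields(key) = value
        Next

        Return fields
    End Function

    Sub Main()
        Dim description As String =
            "<p><b>Submission Date :</b> 2012-11-07 16:53:27<br/> " &
            "<b>IP Address :</b> xx.xxx.xxx.xx<br/> " &
            "<b>First Name :</b> dev<br/> " &
            "<b>Email :</b> test3@here.com<br/> " &
            "<b>18 yrs./Older :</b> YES<br/> " &
            "<b>xxxxOffers :</b> <br/> </p>"

        Dim fields As Dictionary(Of String, String) = ParseDescription(description)

        ' Print every label with its value.
        For Each pair As KeyValuePair(Of String, String) In fields
            Console.WriteLine("{0} = [{1}]", pair.Key, pair.Value)
        Next
    End Sub
End Module
```

Because the text is split on the tags and never on the colon, the ":" characters inside the time value are left alone, and the empty `xxxxOffers` field simply comes back as an empty string. With the sample above, `fields("Email")` is `test3@here.com`.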