Swift RSS project - cannot get the img src link inside a CDATA block

Time: 2015-11-13 09:49:46

Tags: regex xml swift rss cdata

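I have a very annoying problem. I am developing an RSS reader in Swift (using Xcode 7.1), and I want each cell of my table view to show the image of the corresponding news item. Here is my code: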

    // Show the placeholder until a real image URL is found.
    cell.itemImageView.image = UIImage(named: "placeholder")

    let news = items[indexPath.row] as MWFeedItem?
    if let content = news?.content {
        let htmlContent = content as NSString
        var imageSource = ""

        // Capture group 2 holds the value of the first img src attribute.
        let regex = try? NSRegularExpression(pattern: "(<img.*?src=\")(.*?)(\".*?>)", options: [])
        let rangeOfString = NSMakeRange(0, htmlContent.length)

        if let match = regex?.firstMatchInString(content, options: [], range: rangeOfString) {
            let imageURL = htmlContent.substringWithRange(match.rangeAtIndex(2))
            print(imageURL)

            // Ignore URLs that contain "feedburner".
            if imageURL.lowercaseString.rangeOfString("feedburner") == nil {
                imageSource = imageURL
            }
        }

        // Fall back to the placeholder when no usable URL was found.
        if let url = NSURL(string: imageSource) where !imageSource.isEmpty {
            cell.itemImageView.setImageWithURL(url, placeholderImage: UIImage(named: "placeholder"))
        } else {
            cell.itemImageView.image = UIImage(named: "placeholder")
        }
    }

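So the problem is this: when the RSS feed's XML file has no CDATA block, my code works fine. In most other cases it does not work, because the XML file contains a structure like this: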

<![CDATA[<p><a href="http://firenze.repubblica.it/cronaca/2015/11/12/news/la_denuncia_dei_genitori_di_una_scuola_di_firenze_la_mostra_divina_bellezza_vietata_ai_bambini_-127167480/?rssimage"> <img src="http://www.repstatic.it/content/nazionale/img/2015/11/12/115530091-51ce67c2-7b38-41c1-8aa5-21d51b157335.jpg" width="140" align="left" hspace="10"></a>I genitori contro la scelta del consiglio interclasse delle terze elementari dell'istituto Matteotti di fermare la gita all'esposizione "Divina Bellezza" sul...</p>]]></description><guid isPermaLink="true"><!

Apparently the CDATA block prevents me from reading the img src link. What can I do? Thanks in advance for your help!

1 answer:

Answer 0 (score: 0)

I ran the following code in a Playground using your regular expression, and it successfully extracted all the img src URLs from the XML.

import Foundation

// Download the feed and run the question's regex over the raw XML.
let url = NSURL(string: "http://www.repubblica.it/rss/homepage/rss2.0.xml")!
let xml = try String(contentsOfURL: url)
let nsXml = xml as NSString
let regex = try NSRegularExpression(pattern: "(<img.*?src=\")(.*?)(\".*?>)", options: [])
// NSRange counts UTF-16 units, so take the length from NSString rather than xml.characters.count.
let range = NSMakeRange(0, nsXml.length)
regex.enumerateMatchesInString(xml, options: [], range: range) { (result, _, _) -> Void in
  guard let result = result else { return }
  // Capture group 2 is the value of the src attribute.
  print(nsXml.substringWithRange(result.rangeAtIndex(2)))
}
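
To check the same thing without a network request, here is a minimal sketch that applies the regex from the question to a shortened copy of the CDATA snippet. It is written in current Swift syntax rather than the Swift 2 syntax used above, and the helper name `imageSources(in:)`, the `sample` string and its `http://example.com/news` link are assumptions made up for this illustration, not part of the original code.

```swift
import Foundation

// Hypothetical helper (not from the original post): returns the src value
// of every <img> tag the regex finds in the given HTML/XML string.
func imageSources(in html: String) -> [String] {
    // Same pattern as in the question; capture group 2 holds the URL.
    let pattern = "(<img.*?src=\")(.*?)(\".*?>)"
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return []
    }
    // NSRegularExpression counts UTF-16 units, so derive the NSRange from the String itself.
    let fullRange = NSRange(html.startIndex..<html.endIndex, in: html)
    return regex.matches(in: html, options: [], range: fullRange).compactMap { match in
        // Convert the NSRange of group 2 back to a String range before slicing.
        Range(match.range(at: 2), in: html).map { String(html[$0]) }
    }
}

// Shortened sample modelled on the <description> element quoted in the question.
// The href value is a placeholder; the img src is the one from the question.
let sample = "<description><![CDATA[<p><a href=\"http://example.com/news\"> <img src=\"http://www.repstatic.it/content/nazionale/img/2015/11/12/115530091-51ce67c2-7b38-41c1-8aa5-21d51b157335.jpg\" width=\"140\" align=\"left\" hspace=\"10\"></a>I genitori...</p>]]></description>"

let sources = imageSources(in: sample)
print(sources)
```

Since the pattern runs over plain text, the `<![CDATA[` wrapper by itself does not stop it from matching. If the table view code still finds no image, it may be worth printing the exact string that reaches the regex for an item, to confirm that it actually contains the `<img>` tag.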