Swift RSS: extracting the href from content:encoded

Date: 2016-08-10 14:25:04

Tags: regex swift parsing rss nsregularexpression

I have a regular RSS feed that looks like this:

<content:encoded><![CDATA[<div style="width: 1280px; " class="wp-video"><video class="wp-video-shortcode" id="video-54267-2" width="1280" height="720" preload="metadata" controls="controls"><source type="video/mp4" src="http://www.thestreetlede.com/wp-content/uploads/2016/05/bJFWyEWkx5Ri.mp4?_=2" /><a href="http://www.thestreetlede.com/wp-content/uploads/2016/05/bJFWyEWkx5Ri.mp4">http://www.thestreetlede.com/wp-content/uploads/2016/05/bJFWyEWkx5Ri.mp4</a></video></div>
]]></content:encoded>

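My goal is to extract the "http.." link from this <a href="http://www.thestreetlede.com/wp-content/uploads/2016/05/bJFWyEWkx5Ri.mp4"> using a regular expression in Swift, but I am very new to the concept. My code does not extract the video, although I do have a regex that works for images. Here is my code. For images: let regex = "['\"][^'|^\"]*?(?:png|jpg|jpeg|gif)[^'|^\"]*?['\"]"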

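The part I need help with (video): let regex = "['\"][^'|^\"]*?(?:mp4)[^'|^\"]*?['\"]" This returns [] every time, and I am not sure how to fix the regex. If I figure out my mistake I will post an update. Any help is appreciated.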

Update

I wrote this function, but it still does not work. Does anyone see the problem, or can anyone point me in the right direction?

func getImageURLsFromContent(content: String) -> [String] {
    let regex = "<a\\s+href=\"([^\"]+)\""
    var substr = content
    var result: [String] = []
    while let match = substr.rangeOfString(regex, options: [.RegularExpressionSearch, .CaseInsensitiveSearch]) {
        var matchingString = substr.substringWithRange(match)
        matchingString = matchingString.substringFromIndex(matchingString.startIndex.successor())
        matchingString = matchingString.substringToIndex(matchingString.endIndex.predecessor())

        if matchingString.rangeOfString("http:", options: .CaseInsensitiveSearch) != nil || matchingString.rangeOfString("https://", options: .CaseInsensitiveSearch) != nil {
            result.append(matchingString)
        }
        substr = substr.substringFromIndex(match.startIndex.advancedBy(matchingString.characters.count))
    }
    print(result)
    return result
}
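
One likely issue in the function above: rangeOfString with .RegularExpressionSearch returns the range of the whole match (<a href="..."), not the capture group, so trimming one character from each end still leaves a href=" in front of the URL. Below is a minimal sketch of one possible alternative using NSRegularExpression and its capture group. It is written in current Swift syntax rather than the Swift 2 syntax used above, and the function name extractAnchorHrefs is a hypothetical name chosen for illustration. It assumes the href value is quoted and contains no nested quotes.

```swift
import Foundation

// Hypothetical helper (the name is an assumption, not from the original post):
// returns the href value of every <a ...> tag found in `content`.
func extractAnchorHrefs(from content: String) -> [String] {
    // Capture group 1 holds the text between the quotes that follow href=.
    let pattern = "<a\\s+[^>]*?href\\s*=\\s*[\"']([^\"']+)[\"']"
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.caseInsensitive]) else {
        return []
    }

    // NSRegularExpression works with NSRange, so convert the full String range.
    let fullRange = NSRange(content.startIndex..<content.endIndex, in: content)

    return regex.matches(in: content, options: [], range: fullRange).compactMap { match in
        // Convert the capture group's NSRange back to a Range<String.Index>.
        guard let range = Range(match.range(at: 1), in: content) else { return nil }
        return String(content[range])
    }
}
```

To keep only video links, the result could then be filtered, for example with extractAnchorHrefs(from: html).filter { $0.lowercased().hasSuffix(".mp4") }, where html is the string taken from content:encoded. For anything beyond simple feeds like this one, a real HTML parser is generally more robust than a regex.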

0 answers:

No answers