Find & replace a string using regex

Time: 2017-06-08 08:49:21

Tags: swift regex swiftsoup

I need to extract data from a string returned by a website in response to a POST request; I am using the SwiftSoup library to parse the data. I selected the list using a CSS selector:

let iconsList: Element = try doc.select("ul.icons-list").first()!

which returns HTML like this:

<ul class="icons-list">
   <li><strong>Label 1:</strong> Value 1 (Some text) </li>
   <li><strong>Label 2:</strong> Value 2</li>
   <li><strong>Label 3:</strong> Value 3</li>
   <li><strong>Label 4:</strong> Value 4 </li>
   <li><strong>Label 5:</strong> Value 5</li>
</ul>

Now I need to extract the labels and values and store them in an array, or possibly in separate variables. I tried a regex as shown below (it doesn't work; the regex may be wrong):

let result = "This <strong>Needs to be removed</strong> is my string"
let regex = try! NSRegularExpression(pattern: "<strong>(.*)</strong>", options: .caseInsensitive)
var newStr = regex.stringByReplacingMatches(in: result, options: [], range: NSRange(0..<str.utf16.count), withTemplate: "")
print(newStr)

I also tried a SwiftSoup selector:

var labelFirst = try doc.select("ul.icons-list li:nth-child(1)")

But it also returns HTML. So it seems I need a regex in both cases. How should I go about that?

Another question: when I use SwiftSoup's ".select" selector to select the icons-list class, how do I handle an exception if one is thrown? Currently I have this code, but it doesn't work. What if I want to handle multiple try statements inside this block?

1 answer:

Answer 0: (score: 0)

I was able to figure it out myself. Here is how I did it:

import Foundation

let res = "<ul class=\"icons-list\"><li><strong>Label 1:</strong> Value 1 (Some text) </li></ul>"

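extension String {
  /// Returns the capture groups (group 1 onward) of the first match of `pattern`,
  /// or an empty array if the pattern is invalid or nothing matches.
  func capturedGroups(withRegex pattern: String) -> [String] {
    var results = [String]()

    // An invalid pattern yields no results instead of crashing.
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return results
    }

    // NSRange counts UTF-16 code units, so use utf16.count rather than the character count.
    let fullRange = NSRange(location: 0, length: self.utf16.count)
    guard let match = regex.firstMatch(in: self, options: [], range: fullRange) else { return results }

    let lastRangeIndex = match.numberOfRanges - 1
    guard lastRangeIndex >= 1 else { return results }

    for i in 1...lastRangeIndex {
        let capturedGroupRange = match.range(at: i)  // spelled rangeAt(_:) in Swift 3
        // An optional group that did not take part in the match has location NSNotFound.
        guard capturedGroupRange.location != NSNotFound else { continue }
        let matchedString = (self as NSString).substring(with: capturedGroupRange)
        results.append(matchedString)
    }

    return results
  }
}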

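let label1 = res.capturedGroups(withRegex: "<strong>(.*)</strong>")
let value1 = res.capturedGroups(withRegex: "</strong>(.*)</li>")

// label1[0] already ends with a colon and value1[0] starts with a space,
// so no extra separator is needed.
print("\(label1[0])\(value1[0])")
// Output: Label 1: Value 1 (Some text)
// (followed by a trailing space that comes from the HTML)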
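The function above reads only the first match, so it handles a single list item. As a rough sketch of how the same idea might extend to every `<li>` in the list, the snippet below uses one pattern with two non-greedy capture groups and walks all matches. The helper name `labelValuePairs(in:)` and the `sample` string are hypothetical and made up for illustration; it assumes Swift 4 or later and that every item follows the `<li><strong>label</strong> value</li>` shape shown in the question.

```swift
import Foundation

/// Hypothetical helper (not part of the answer above): returns every
/// (label, value) pair found in `<li><strong>label</strong> value</li>` items.
func labelValuePairs(in html: String) -> [(label: String, value: String)] {
    // Non-greedy groups keep each match inside a single <li>.
    let pattern = "<li>\\s*<strong>(.*?)</strong>(.*?)</li>"
    guard let regex = try? NSRegularExpression(
        pattern: pattern,
        options: [.caseInsensitive, .dotMatchesLineSeparators]
    ) else {
        return []
    }

    let ns = html as NSString
    let fullRange = NSRange(location: 0, length: ns.length)

    return regex.matches(in: html, options: [], range: fullRange).map {
        match -> (label: String, value: String) in
        // Both groups are mandatory in the pattern, so their ranges are always valid.
        let label = ns.substring(with: match.range(at: 1))
        let value = ns.substring(with: match.range(at: 2))
        return (label: label.trimmingCharacters(in: .whitespacesAndNewlines),
                value: value.trimmingCharacters(in: .whitespacesAndNewlines))
    }
}

let sample = """
<ul class="icons-list">
   <li><strong>Label 1:</strong> Value 1 (Some text) </li>
   <li><strong>Label 2:</strong> Value 2</li>
</ul>
"""

for pair in labelValuePairs(in: sample) {
    print("\(pair.label) \(pair.value)")
}
// Prints:
// Label 1: Value 1 (Some text)
// Label 2: Value 2
```

Regex matching like this is fragile against nested tags or attributes on `<li>`, so it is only reasonable when the markup is as regular as in the question.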

If anyone can suggest a better approach or improve my function, I would still appreciate it!
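The question also asks how to handle errors when several `try` calls sit in one block. A minimal sketch of the general Swift pattern follows. `LookupError`, `firstItem(matching:in:)` and `describe(_:)` are hypothetical stand-ins for a throwing call such as SwiftSoup's `select`; they are not part of any library, and the dictionary merely simulates a document.

```swift
// Hypothetical error type and throwing function, used here only to
// stand in for a library call that can throw.
enum LookupError: Error {
    case notFound(String)
}

func firstItem(matching selector: String, in items: [String: String]) throws -> String {
    guard let item = items[selector] else {
        throw LookupError.notFound(selector)
    }
    return item
}

func describe(_ items: [String: String]) -> String {
    do {
        // Several `try` calls can share one `do` block; the first one
        // that throws skips the rest and jumps to the matching `catch`.
        let first = try firstItem(matching: "li:nth-child(1)", in: items)
        let second = try firstItem(matching: "li:nth-child(2)", in: items)
        return "\(first), \(second)"
    } catch LookupError.notFound(let selector) {
        return "nothing matched \(selector)"
    } catch {
        // Catch-all for any other error type.
        return "unexpected error: \(error)"
    }
}

print(describe(["li:nth-child(1)": "Value 1", "li:nth-child(2)": "Value 2"]))
// Prints: Value 1, Value 2
print(describe(["li:nth-child(1)": "Value 1"]))
// Prints: nothing matched li:nth-child(2)
```

Note that a force unwrap such as `first()!` is not a thrown error, so no `catch` clause will intercept it; unwrapping with `guard let` or `if let` avoids a crash when the selector matches nothing.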