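I am trying to find regular-expression matches of some text in an HTML string and wrap each match in special markup. In the example string below, I want to find the word swiftsoup and replace it with <b>swiftsoup</b>, but skip every match inside an attribute, such as id="swiftsoup" or the href URL.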
// example string
<p>swiftsoup is awesome, but I don't know how to solve with <a id="swiftsoup" href="https://github.com/scinfu/swiftsoup">swiftsoup</a> or other. Love swiftsoup even so.</p>
The SwiftSoup code below does not work, of course: neither ownText() nor text() is a mutating function, so the result of replacingOccurrences(of:with:) is simply left unused:
let h = #"<p>swiftsoup is awesome, but I don't know how to solve with <a id="swiftsoup" href="https://github.com/scinfu/swiftsoup">swiftsoup</a> or other. Love swiftsoup even so.</p>"#
let p = try! SwiftSoup.parse(h).select("p").first()!
p.ownText().replacingOccurrences(of: "swiftsoup", with: "<b>swiftsoup</b>")
// ^ warning: result of call to 'replacingOccurrences(of:with:)' is unused
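To make the warning concrete, here is a minimal sketch that uses only Foundation. The constant `ownText` is a stand-in I made up for the value returned by p.ownText(); it is an ordinary String, so replacingOccurrences(of:with:) returns a new string and leaves the original (and, presumably, the parsed element) unchanged.

```swift
import Foundation

// Stand-in for the String returned by p.ownText() (an assumption for illustration).
let ownText = "swiftsoup is awesome"

// replacingOccurrences(of:with:) does not mutate `ownText`;
// it returns a new String that has to be captured.
let marked = ownText.replacingOccurrences(of: "swiftsoup", with: "<b>swiftsoup</b>")

print(ownText) // swiftsoup is awesome
print(marked)  // <b>swiftsoup</b> is awesome
```

Capturing the result silences the warning, but the new string still has to be written back into the document somehow, which is the part I am stuck on.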
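Maybe a regular expression applied to the result of html() could help, but I do not know how to leave the matches inside attribute values untouched: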
extension String {
    func markUpSwiftSoup() -> String {
        var selfResult = self
        let selfAsNSString = self as NSString
        if let regex = try? NSRegularExpression(pattern: "swiftsoup") {
            let range = NSRange(location: 0, length: selfAsNSString.length)
            regex.matches(in: self, options: [], range: range).forEach {
                let match = selfAsNSString.substring(with: $0.range)
                selfResult = selfResult.replacingOccurrences(of: match, with: "<b>\(match)</b>")
            }
            return selfResult
        } else {
            return self
        }
    }
}
let pHTML = try! p.html()
try! p.html(pHTML.markUpSwiftSoup())
The result I am trying to get is:
<p><b>swiftsoup</b> is awesome, but I don't know how to solve with <a id="swiftsoup" href="https://github.com/scinfu/swiftsoup"><b>swiftsoup</b></a> or other. Love <b>swiftsoup</b> even so.</p>
Thanks!
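For reference, a minimal sketch of one regex-only possibility, not a verified SwiftSoup solution. The helper `markUp(_:in:)` is a hypothetical name, and the approach assumes well-formed HTML in which `<` and `>` never appear unescaped inside text or attribute values. It relies on a negative lookahead: a match is rejected when a `>` follows before any other `<` or `>`, which would mean the word sits inside a tag.

```swift
import Foundation

// Hypothetical helper (not part of SwiftSoup): wraps `word` in <b>...</b>
// wherever it occurs outside of a tag.
func markUp(_ word: String, in html: String) -> String {
    let escaped = NSRegularExpression.escapedPattern(for: word)
    // (?![^<>]*>) fails the match if the next angle bracket is ">",
    // i.e. if the word is inside a tag such as an attribute value.
    let pattern = "\(escaped)(?![^<>]*>)"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return html }
    let range = NSRange(html.startIndex..., in: html)
    // "$0" in the template stands for the whole match.
    return regex.stringByReplacingMatches(in: html, options: [], range: range,
                                          withTemplate: "<b>$0</b>")
}

let input = #"<p>swiftsoup is awesome, but I don't know how to solve with <a id="swiftsoup" href="https://github.com/scinfu/swiftsoup">swiftsoup</a> or other. Love swiftsoup even so.</p>"#
let result = markUp("swiftsoup", in: input)
print(result)
```

On the example string this leaves the id and href values alone and wraps the three text occurrences. It would also match inside comments or script bodies, and it breaks if an attribute value contains a raw `>`, so walking the parsed text nodes is probably the more robust route.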