Extracting words from a sentence using a regular expression in Swift

Time: 2017-10-21 23:40:54

Tags: regex swift nsregularexpression

I want a regular expression that extracts Starboy and The Weeknd / Daft Punk from the following string:

The Weeknd / Daft Punk - text=\"Starboy\" song_spot=\"M\" MediaBaseId=\"2238986\" itunesTrackId=\"0\" amgTrackId=\"-1\" amgArtistId=\"0\" TAID=\"744880\" TPID=\"43758958\" cartcutId=\"08

Here is my attempt so far:

do {  
    let input = "The Weeknd / Daft Punk - text=\"Starboy\" song_spot=\"M\" MediaBaseId=\"2238986\" itunesTrackId=\"0\" amgTrackId=\"-1\" amgArtistId=\"0\" TAID=\"744880\" TPID=\"43758958\" cartcutId=\"0893584001\""  
    let regex = try NSRegularExpression(pattern: "text=\"(.*)", options: NSRegularExpression.Options.caseInsensitive)  
    let matches = regex.matches(in: input, options: [], range: NSRange(location: 0, length: input.utf16.count))  

    if let match = matches.first {  
        let range = match.range(at:1)  
        if let swiftRange = Range(range, in: input) {  
            let name = input[swiftRange]  
            print(name)  
        }  
    }  
} catch {  
    print("Regex was bad!")  
}  

But this gives me the whole rest of the string:

Starboy" song_spot="M" MediaBaseId="2238986" itunesTrackId="0" amgTrackId="-1" amgArtistId="0" TAID="744880" TPID="43758958" cartcutId="0893584001"

1 Answer:

Answer 0 (score: 0)

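To capture all of the text that comes before the sequence - text= followed by a quoted word, you can use the regex ".*(?=(text=\"[\\w\\s]+\"))", which relies on a lookahead. To capture the word that follows text=", you can use the regex "(?<=text=\")([\\w\\s]+)", which relies on a lookbehind. If you want to capture both ranges, join the two patterns with "|". The combined pattern below also puts " - " inside the lookahead, so the separator is left out of the first match: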

let string = """
The Weeknd / Daft Punk - text=\"Starboy\" song_spot=\"M\" MediaBaseId=\"2238986\" itunesTrackId=\"0\" amgTrackId=\"-1\" amgArtistId=\"0\" TAID=\"744880\" TPID=\"43758958\" cartcutId=\"08
"""
let pattern = ".*(?=( - text=\"[\\w\\s]+\"))|(?<=text=\")([\\w\\s]+)"

do {
    let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
    let matches = regex.matches(in: string, options: [], range: NSRange(location: 0, length: string.utf16.count))
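    // Skip zero-length matches: `.*` can also match the empty string right before " - text=".
    for match in matches where match.range.length > 0 {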
        if let range = Range(match.range, in: string) {
            let name = string[range]
            print(name)
        }
    }
} catch {
    print("Regex was bad!")
}

This prints:

The Weeknd / Daft Punk
Starboy
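
A note on why the attempt in the question returned too much: (.*) is greedy, so after text=" it keeps matching to the end of the line. As an alternative to the lookaround pattern above, here is a minimal sketch that uses a single pattern with two capture groups. The function name extractArtistAndTitle is a hypothetical helper introduced only for this illustration, the sample line is a shortened copy of the input from the question, and the sketch assumes the line always has the shape ARTIST - text="TITLE".

```swift
import Foundation

// Hypothetical helper (not part of the original answer): returns the artist and
// title from a line shaped like `ARTIST - text="TITLE" ...`, or nil if the line
// does not have that shape.
func extractArtistAndTitle(from input: String) -> (artist: String, title: String)? {
    // Group 1: everything before " - text=" (lazy, so it stops at the first one).
    // Group 2: characters up to the next double quote. `[^"]*` cannot run past
    // the closing quote the way a greedy `.*` does.
    let pattern = "^(.*?) - text=\"([^\"]*)\""
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return nil
    }
    let fullRange = NSRange(input.startIndex..., in: input)
    guard let match = regex.firstMatch(in: input, options: [], range: fullRange),
          let artistRange = Range(match.range(at: 1), in: input),
          let titleRange = Range(match.range(at: 2), in: input) else {
        return nil
    }
    return (String(input[artistRange]), String(input[titleRange]))
}

// Shortened sample of the input from the question.
let line = "The Weeknd / Daft Punk - text=\"Starboy\" song_spot=\"M\" MediaBaseId=\"2238986\""
if let result = extractArtistAndTitle(from: line) {
    print(result.artist)  // The Weeknd / Daft Punk
    print(result.title)   // Starboy
}
```

Reading the two values with match.range(at:) keeps each piece tied to its own group, so there is no need to rely on the order in which separate matches come back. Unlike [\\w\\s]+, the class [^"]* also accepts titles that contain punctuation.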