Splitting a string on every character except those matched by a regular expression

Time: 2019-03-16 00:46:51

Tags: swift regex

I have to split a long string containing a song's lyrics into lines, and then split each line into words. I will store the result in a 2D array.

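I have seen some similar questions that were solved with [NSRegularExpression](https://developer.apple.com/documentation/foundation/nsregularexpression), but I can't find a regular expression that means "everything except something", which is what I need to split on when breaking a string into words.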

More specifically, I want to split on everything except alphanumeric characters, ' and -. In Java, the regular expression for this is [^\\w'-]+

Below is the string, followed by my Swift attempt at this task. (For now I only split on spaces instead of splitting on "[^\\w'-]+", because I don't know how to do that.)

 1 Is this the real life?
 2 Is this just fantasy?
 3 Caught in a landslide,
 4 No escape from reality.
 5 
 6 Open your eyes,
 7 Look up to the skies and see,
 8 I'm just a poor boy, I need no sympathy,
 9 Because I'm easy come, easy go,
10 Little high, little low,
11 Any way the wind blows doesn't really matter to me, to me.
12 
13 Mama, just killed a man,

(etc.)


let lines = s?.components(separatedBy: "\n")
var all_words = [[String]]()
for i in 0..<lines!.count {
    let words = lines![i].components(separatedBy: " ")
    let new_words = words.filter { $0 != "" }
    all_words.append(new_words)
}

2 Answers:

Answer 0 (score: 1)

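I suggest using the inverse pattern, [\w'-]+, to match the strings you want rather than splitting on the ones you don't, together with a matches(for:in:) regex-matching function.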

Your code would then look like this:

for i in 0..<lines!.count {
    let new_words = matches(for: "[\\w'-]+", in: lines![i])
    all_words.append(new_words)
}

The following line of code:

print(matches(for: "[\\w'-]+", in: "11 Any way the wind blows doesn't really matter to me, to me."))

yields ["11", "Any", "way", "the", "wind", "blows", "doesn\'t", "really", "matter", "to", "me", "to", "me"]
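
The matches(for:in:) function called above is not defined in this answer. A minimal sketch of what such a helper might look like, assuming it is a thin wrapper around NSRegularExpression that returns the matched substrings (the body, the `pattern` parameter name, and the choice to return an empty array for an invalid pattern are assumptions, not taken from the answer):

```swift
import Foundation

// Hypothetical helper: returns every substring of `text` matched by `pattern`.
// An invalid pattern yields an empty array instead of crashing.
func matches(for pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else {
        return []
    }
    // NSRegularExpression works with NSRange (UTF-16 offsets),
    // so build one that covers the whole string.
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.matches(in: text, range: fullRange).compactMap { result in
        // Convert each NSRange back into a Swift string range.
        Range(result.range, in: text).map { String(text[$0]) }
    }
}

// Build the 2D array of words, one inner array per line.
let lyrics = "8 I'm just a poor boy, I need no sympathy,\n9 Because I'm easy come, easy go,"
let allWords = lyrics
    .components(separatedBy: "\n")
    .map { matches(for: "[\\w'-]+", in: $0) }
print(allWords)
```

Mapping over the lines this way also avoids the force-unwraps on an optional array, at the cost of requiring a non-optional input string.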

Answer 1 (score: 0)

A simple solution is to first replace each run of separator characters with a special character, and then split on that character:

let words = string
    .replacingOccurrences(of: "[^\\w'-]+", with: "|", options: .regularExpression)
    .split(separator: "|")
print(words)

However, if you can, use the system's built-in functionality for enumerating words.
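
The answer does not say which system facility it has in mind. One candidate is Foundation's enumerateSubstrings(in:options:_:) with the .byWords option; the sketch below assumes that is what was meant, and the `words(in:)` helper name is hypothetical:

```swift
import Foundation

// Hypothetical helper: collects the words of `line` using
// Foundation's own word-boundary rules instead of a regex.
func words(in line: String) -> [String] {
    var result: [String] = []
    line.enumerateSubstrings(in: line.startIndex..<line.endIndex,
                             options: .byWords) { word, _, _, _ in
        // `word` is nil only when .substringNotRequired is passed.
        if let word = word {
            result.append(word)
        }
    }
    return result
}

print(words(in: "Open your eyes,"))
```

Word boundaries here follow Foundation's rules rather than the [\w'-]+ pattern, so apostrophes and hyphens may not be treated the same way as in the regex-based approaches; it is worth checking the output against your own text.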