RealmSwift: Counting the number of unique words

Time: 2020-05-30 23:20:39

Tags: swift realm nspredicate

Given the following setup:

import RealmSwift

class Message: Object {

    @objc dynamic var name: String = ""
    @objc dynamic var message: String = ""
}

class Chat: Object {

    // Realm List properties are declared with let; the list itself is mutated in place
    let messages = List<Message>()
    let people = List<Person>()

    @objc dynamic var name:String = ""
    @objc dynamic var path:String = ""
}

Is there a more efficient way, using functional programming, to calculate the number of unique words in the message property of Message?

let messages = Array(chat.messages)
var wordDictionary = [String: Int]() // var, because it is mutated in the loop
let peopleDictionary = [String]()

for messageObject in messages {

   let words = messageObject.message.components(separatedBy: " ")
   for word in words {
      if let count = wordDictionary[word] {
          wordDictionary[word] = count + 1
      } else {
         wordDictionary[word] = 1 // first occurrence counts as 1
      }
   }
}

2 Answers:

Answer 0 (score: 2)

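Yes, you can use enumerateSubstrings(in:options:) with the .byWords option to break the sentences into words, and the reduce method to count their frequency. Don't forget to lowercase the words, so that differently cased spellings of the same word are not counted as different words: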


import Foundation // enumerateSubstrings(in:options:) comes from Foundation

extension StringProtocol {
    var byWords: [SubSequence] { components(separated: .byWords) }
    func components(separated options: String.EnumerationOptions) -> [SubSequence] {
        var components: [SubSequence] = []
        enumerateSubstrings(in: startIndex..., options: options) { _, range, _, _ in components.append(self[range]) }
        return components
    }
}

extension Sequence where Element: Hashable {
    var frequency: [Element: Int] { reduce(into: [:]) { $0[$1, default: 0] += 1 } }
}

Usage:

let sentence1 = "Given the following setup:"
let sentence2 = "Is there a more efficient way using functional programming to calculate the number of unique words within the message variable within Message"

let sentences = [sentence1, sentence2]
let frequency = sentences
    .joined(separator: "\n")
    .lowercased()
    .byWords
    .frequency

print(frequency.sorted(by: {$0.value > $1.value }))

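This will print (the order of entries with the same count may differ between runs, because Dictionary is unordered):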

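[(key: "the", value: 3), (key: "within", value: 2), (key: "message", value: 2), (key: "following", value: 1), (key: "way", value: 1), (key: "more", value: 1), (key: "to", value: 1), (key: "calculate", value: 1), (key: "number", value: 1), (key: "there", value: 1), (key: "a", value: 1), (key: "is", value: 1), (key: "unique", value: 1), (key: "setup", value: 1), (key: "using", value: 1), (key: "programming", value: 1), (key: "given", value: 1), (key: "words", value: 1), (key: "variable", value: 1), (key: "functional", value: 1), (key: "efficient", value: 1), (key: "of", value: 1)]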
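
Since the question asks for the number of unique words, it may be worth noting that this is simply the count of such a frequency dictionary. Below is a minimal sketch of that idea, not a definitive implementation. It assumes plain strings stand in for the Realm message values; sampleMessages and words(in:) are hypothetical names, and it splits on whitespace rather than using the byWords extension so that it is self-contained (punctuation would therefore stay attached to words).

```swift
// Hypothetical sample data standing in for each Message's `message` string.
let sampleMessages = ["The cat sat", "the dog sat down"]

// Hypothetical helper: lowercase the text and split it on whitespace.
func words(in text: String) -> [String] {
    return text.lowercased()
        .split(whereSeparator: { $0.isWhitespace })
        .map { String($0) }
}

// Tally how often each word occurs across all messages.
let wordCounts = sampleMessages
    .flatMap { message in words(in: message) }
    .reduce(into: [String: Int]()) { counts, word in
        counts[word, default: 0] += 1
    }

// The number of distinct keys is the number of unique words.
let uniqueWordCount = wordCounts.count
print(uniqueWordCount)          // 5 (the, cat, sat, dog, down)
print(wordCounts["the"] ?? 0)   // 2
```

If only the unique count is needed and not the per-word frequencies, a Set would do the same job with less bookkeeping.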

Answer 1 (score: 0)

You asked for functional programming, and you are using .components(separatedBy: " "), so this assumes that all words are separated by a single space.
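
The single-space assumption matters: when two spaces appear in a row, components(separatedBy:) yields an empty string, which would then be counted as a word. The small sketch below illustrates this with a made-up sample string and shows that split(separator:) drops empty pieces by default.

```swift
import Foundation // components(separatedBy:) comes from Foundation

let text = "good  boy" // made-up sample; note the two consecutive spaces

// components(separatedBy:) keeps the empty piece between the two spaces.
let viaComponents = text.components(separatedBy: " ")
print(viaComponents) // ["good", "", "boy"]

// split(separator:) omits empty subsequences by default.
let viaSplit = text.split(separator: " ").map { String($0) }
print(viaSplit)      // ["good", "boy"]
```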

Let's use .map, .flatMap, and Set, which is guaranteed to contain only unique elements. For example:

let sentence = "Every good boy does fine does" //6 total words, 5 unique
let words = sentence.components(separatedBy: " ")
let wordSet = Set(words)
print(wordSet.count)

The output is

5

Then apply that to the question:

let messages = Array(chat.messages) //makes the Realm list a Swift Array
let x = messages.map { msgString -> [String] in
    let y = msgString.message.components(separatedBy: " ")
    return y
}
let uniqueWords = Set( x.flatMap { $0 } )
print(uniqueWords.count)

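The map function takes each message in the messages array and breaks it into an array of strings, producing an array of arrays that looks like this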

[ [word0, word1, word2], [word3, word4, word5] ]

Then flatMap flattens the array of arrays into a single array of words

Finally, Set takes all the words and creates a set of unique words.
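
The map and flatMap steps can also be collapsed into a single flatMap call. The following is a sketch only: PlainMessage is a hypothetical stand-in for the Realm Message object so that the example runs without Realm, and the sample data is made up.

```swift
// Hypothetical stand-in for the Realm `Message` object.
struct PlainMessage {
    let name: String
    let message: String
}

let plainMessages = [
    PlainMessage(name: "a", message: "Every good boy"),
    PlainMessage(name: "b", message: "does fine does")
]

// flatMap splits each message and flattens the pieces into one array of words.
let allWords = plainMessages.flatMap { $0.message.split(separator: " ") }

// Set keeps one copy of each distinct word.
let distinctWords = Set(allWords)

print(allWords.count)      // 6 total words
print(distinctWords.count) // 5 unique words
```

With Realm, the same chain should work directly on chat.messages, since a Realm List is a Swift Sequence; that part is not exercised here.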