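How do I replace with "*" only the vowels of words that match a word in a given array?

Time: 2019-06-08 08:24:34

Tags: regex ruby

I need to write a Ruby method that takes a string and an array. If any word in the string matches a word in the given array, the vowels of every matching word in the string should be replaced with "*". I tried to do this with a regex and an if condition, but I don't understand why it doesn't work. I'd be grateful if someone could explain where I went wrong and how to write this code correctly.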


2 answers:

Answer 0: (score: 1)

arr.include? sentence.downcase reads as: "if one of the elements of arr equals sentence.downcase ...", which is not what you want.

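baddies = ["gosh", "it's", "hot", "shoot", "so"]
sentence = "Gosh, it's so very hot"

# The non-capturing group (?:...) makes both \b anchors apply to every
# alternative, not only to the first and the last one.
r = /\b(?:#{baddies.join('|')})\b/i
  #=> /\b(?:gosh|it's|hot|shoot|so)\b/i
sentence.gsub(r) { |w| w.gsub(/[aeiou]/i, '*') }
  #=> "G*sh, *t's s* very h*t"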

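In the regex, \b is a word boundary and #{baddies.join('|')} requires a match of one of the baddies. The word boundaries keep "so" from matching inside longer words such as "solo" or "also". The regex can also be written like this: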

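# As before, (?:...) keeps both \b anchors attached to every alternative.
/\b(?:#{Regexp.union(baddies).source})\b/i
  #=> /\b(?:gosh|it's|hot|shoot|so)\b/i

See Regexp::union and Regexp#source. source is needed because interpolating Regexp.union(baddies) itself would embed the pattern together with its own flags, as (?-mix:...), so the outer case-insensitivity modifier (i) would have no effect on it.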
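To make the role of source concrete, here is a small sketch comparing the two ways of interpolating the union. The helper names pattern_from_object and pattern_from_source are made up for this illustration and are not part of the answer.

```ruby
# Hypothetical helper names, used only for this illustration.
def pattern_from_object(words)
  # Interpolating the Regexp object embeds "(?-mix:...)", which turns /i off
  # for the embedded alternation.
  /\b#{Regexp.union(words)}\b/i
end

def pattern_from_source(words)
  # Interpolating only the source text lets the outer /i apply; the
  # non-capturing group makes both \b anchors cover every alternative.
  /\b(?:#{Regexp.union(words).source})\b/i
end

baddies = ["gosh", "it's", "hot", "shoot", "so"]
puts pattern_from_object(baddies).source   # \b(?-mix:gosh|it's|hot|shoot|so)\b
puts pattern_from_source(baddies).source   # \b(?:gosh|it's|hot|shoot|so)\b
p pattern_from_object(baddies).match?("Gosh")   # false
p pattern_from_source(baddies).match?("Gosh")   # true
```

Regexp.union also escapes regex metacharacters in the strings it is given, which a plain join('|') does not do.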

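Another approach is to split the sentence into words, manipulate each word, and then rejoin all the pieces to form a new sentence. One difficulty with that approach concerns the character "'", which does double duty as a single quote and as an apostrophe. Consider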

sentence = "She liked  the song, 'don't box me in'"
baddies = ["don't"]

The approach I've given here yields the correct result:

r = /\b(?:#{baddies.join('|')})\b/i
  #=> /\b(?:don't)\b/i
sentence.gsub(r) { |w| w.gsub(/[aeiou]/i, '*') }
  #=> "She liked  the song, 'd*n't box me in'"
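The gsub approach can be wrapped in a method, as the question asks. The following is only a sketch: the method name mask_vowels is invented here, and it assumes that a word counts as a match only when word boundaries surround it.

```ruby
# Hypothetical method name; a sketch of the gsub approach shown above.
def mask_vowels(sentence, baddies)
  # Group the alternatives so that \b applies on both sides of each word.
  pattern = /\b(?:#{Regexp.union(baddies).source})\b/i
  sentence.gsub(pattern) { |word| word.gsub(/[aeiou]/i, '*') }
end

puts mask_vowels("Gosh, it's so very hot", ["gosh", "it's", "hot", "shoot", "so"])
# G*sh, *t's s* very h*t
puts mask_vowels("She liked  the song, 'don't box me in'", ["don't"])
# She liked  the song, 'd*n't box me in'
puts mask_vowels("A shotgun solo", ["hot", "so"])
# A shotgun solo
```

The last call shows why the grouping matters: without it, "hot" would be matched inside "shotgun", because only the first and last alternatives would carry a word boundary.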

If we instead split the sentence into pieces, we might try the following:

sentence.split(/([\p{Punct}' ])/)
  #=> ["She", " ", "liked", " ", "", " ", "the", " ", "song", ",", "",
  #    " ", "", "'", "don", "'", "t", " ", "box", " ", "me", " ", "in", "'"]

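As you can see, the regex splits "don't" into "don", "'" and "t", which is not what we want. Clearly, telling single quotes from apostrophes is no easy task. It is made harder by the fact that words can begin or end with an apostrophe ("'twas"), and that possessive nouns ending in "s" are followed by an apostrophe ("Chris' car").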
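One partial workaround, offered only as a sketch, is to treat an apostrophe as part of a word when it has letters on both sides. The helper name words_with_inner_apostrophes is hypothetical, and the pattern deliberately does not try to solve the leading and trailing apostrophe cases mentioned above.

```ruby
# Hypothetical helper: letters, optionally followed by groups of an
# apostrophe plus more letters, so "don't" survives as a single token.
def words_with_inner_apostrophes(sentence)
  sentence.scan(/[[:alpha:]]+(?:'[[:alpha:]]+)*/)
end

p words_with_inner_apostrophes("She liked  the song, 'don't box me in'")
# ["She", "liked", "the", "song", "don't", "box", "me", "in"]
p words_with_inner_apostrophes("'twas Chris' car")
# ["twas", "Chris", "car"]
```

The second call shows the limitation: the apostrophes in "'twas" and "Chris'" are dropped, because nothing in the text says whether they are quotes or apostrophes.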

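Answer 1: (score: 0)

When the condition holds, your code does not return any value.

One option is to split the sentence into words and punctuation, manipulate the words, and then rejoin everything: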

def censor(sentence, arr)
  # Split the sentence into an array of words and punctuation runs.
  words = sentence.scan(/[\w'-]+|[.,!?]+/)
  res = []
  words.each do |word|
    # /i so that uppercase vowels are masked as well.
    word = word.gsub(/[aeiou]/i, "*") if arr.include? word.downcase
    res << word
  end
  res.join(' ') # note: this also puts a space before punctuation
end


puts censor("Gosh, it's so hot", ["gosh", "hot", "shoot", "so"])
#=> G*sh , it's s* h*t

Note that res.join(' ') also adds a space before punctuation. I'm not very good with regexps, but this should fix it:

res.join(' ').gsub(/ [.,!?]/) { |punct| punct.strip }
#=> G*sh, it's s* h*t

This part, words = sentence.scan(/[\w'-]+|[.,!?]+/), returns ["Gosh", ",", "it's", "so", "hot"].
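A possible variation, sketched here under the assumption that the original spacing and punctuation should be left untouched, is to substitute the words in place with gsub instead of splitting and rejoining. The method name censor_in_place is made up for this example.

```ruby
# Hypothetical variant: substitute matching words in place, so spacing and
# punctuation never need to be repaired afterwards.
def censor_in_place(sentence, arr)
  sentence.gsub(/[\w'-]+/) do |word|
    arr.include?(word.downcase) ? word.gsub(/[aeiou]/i, "*") : word
  end
end

puts censor_in_place("Gosh, it's so hot", ["gosh", "hot", "shoot", "so"])
# G*sh, it's s* h*t
```

Because the token pattern includes "'", a word wrapped in single quotes (such as 'so') is read as one token with the quotes attached and is not censored.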