How do I remove an array of words from a sentence?

Asked: 2011-11-16 06:51:42

Tags: ruby-on-rails ruby

I have an array of stop words:

myArray = ["","a","ago","also","am","an","and","ani","ar","aren't","arent","as","ask","at","did","didn't","didnt","do","doe","would","be","been","best","better"]

I want to remove the matching items from a sentence:

str = 'A something and hello'

so that it becomes:

'something hello'

1. How can I do this in Ruby?

2. How can I do the same for an array of characters (removing every matching character)?

Here is the character array:

["(",")","@","#","^"]

4 answers:

Answer 0 (score: 9)

sentence = 'A something and hello'
array  = ["","a","ago","also","am","an","and","ani","ar","aren't","arent",
          "as","ask","at","did","didn't","didnt","do","doe","would",
          "be","been","best","better"]


sentence.split.delete_if{|x| array.include?(x)}.join(' ')

 => "A something hello" 

You may want to downcase each word before comparing, to get rid of the "A" at the beginning of the sentence:

sentence.split.delete_if{|x| array.include?(x.downcase)}.join(' ')

 => "something hello" 

If you have an array of strings, it is even easier:

(sentence.split - array).join(' ')
=> "A something hello"    #  but note that this doesn't catch the "A"
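Array subtraction compares strings exactly, which is why "A" survives. As a minimal sketch, the case-insensitive version can be wrapped in a small helper. The name `remove_stop_words`, the `STOP_WORDS` constant, and the shortened word list are assumptions for illustration, not part of the original answer; a `Set` is used only so that lookups stay fast with a long stop-word list.

```ruby
require 'set'

# Hypothetical constant: a shortened stop-word list for illustration.
STOP_WORDS = Set.new(["a", "and", "do", "would"])

# Hypothetical helper: drops every word whose downcased form is a stop word,
# while keeping the original casing of the words that remain.
def remove_stop_words(sentence, stop_words = STOP_WORDS)
  sentence.split.reject { |word| stop_words.include?(word.downcase) }.join(' ')
end

remove_stop_words('A something and hello')  # => "something hello"
```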

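To also drop the special characters (this removes only tokens that consist of exactly one of those characters):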

special = ["(",")","@","#","^"]

sentence.split.delete_if{|x| array.include?(x.downcase) || special.include?(x) }.join(' ')
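The line above only drops a special character when it stands alone as a word. If the characters can also appear inside words, as in "(something)", one option is to strip them with `gsub` before splitting. This is a sketch under that assumption; the sample sentence and the variable names `special_pattern` and `cleaned` are made up for illustration.

```ruby
special  = ["(", ")", "@", "#", "^"]
sentence = 'A (something) and #hello'

# Regexp.union escapes each string, so "(" and "^" are matched literally
# instead of being treated as regex metacharacters.
special_pattern = Regexp.union(special)

# Remove every occurrence of any special character, wherever it appears.
cleaned = sentence.gsub(special_pattern, '')
# cleaned is now "A something and hello"
```

The stop-word removal shown earlier can then be applied to `cleaned`.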

Another way to remove words or phrases:

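array.reject(&:empty?).each do |phrase|
  # escape the phrase and match whole words only, ignoring case
  sentence.gsub!(/\b#{Regexp.escape(phrase)}\b/i, '')
end
sentence = sentence.squeeze(' ').strip  # collapse the gaps left behind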
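The loop above runs one `gsub!` per stop word. A possible alternative is to build a single pattern and make one pass over the sentence. The sketch below assumes a shortened stop-word list, and the names `words`, `pattern`, and `result` are illustrative. Word boundaries (`\b`) keep a short stop word such as "a" from matching inside a longer word such as "and".

```ruby
stop_words = ["", "a", "and", "aren't", "do"]  # shortened list for illustration
sentence   = 'A something and hello'

# The empty string would match at every position, so drop it first.
words = stop_words.reject(&:empty?)

# Regexp.union escapes each word; .source is used so that the /i flag
# of the outer pattern applies to the alternation as well.
pattern = /\b(?:#{Regexp.union(words).source})\b/i

# One pass removes all stop words; squeeze and strip tidy the leftover spaces.
result = sentence.gsub(pattern, '').squeeze(' ').strip
# result is now "something hello"
```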

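Answer 1 (score: 1)

A one-line variant of Tilo's answer that is clean and case-insensitive (although it returns all-lowercase output, which may not be ideal for every use):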

(sentence.downcase.split - array).join(' ')

Answer 2 (score: 0)

My solution:

stop_words = ["","a","ago","also","am","an","and","ani","ar","aren't","arent","as","ask","at","did","didn't","didnt","do","doe","would","be","been","best","better"]
output = %w(A something and hello) - stop_words
# => ["A", "something", "hello"]  (the comparison is case-sensitive, so "A" remains)

Answer 3 (score: -1)

 array.map {|s| s.gsub(keyword, '')}