irb(main):021:0> TEXT = <<EOF
irb(main):022:0" Text messaging, or texting, is the act of composing
and sending electronic messages between two or more mobile phones, or fixed or
portable devices over a phone network. The term originally referred to messages
sent using the Short Message Service (SMS).
It has grown to include multimedia messages(known as MMS) containing images, videos,
and sound content, as well as ideograms known as emoji
irb(main):023:0" EOF
=> "Text messaging, or texting, is the act of composing and sending electronic
messages between two or more mobile phones, or fixed or portable devices over a
phone network. The term originally referred to messages sent using the Short
Message Service (SMS). It has grown to include multimedia messages (known as MMS)
containing images, videos, and sound content, as well as ideograms known as emoji\n"
irb(main):064:0> LEXICON = {
irb(main):065:1* "Text" => "WRITING",
irb(main):066:1* "is" => "WAS"
irb(main):067:1> }
irb(main):071:0> terms = LEXICON.keys.map {|k| Regexp.new(Regexp.escape(k))}.join("|")
=> "(?-mix:Text)|(?-mix:is)" <------------ What is this (?-mix:...) thing?
irb(main):072:0> sanitized = TEXT.gsub(pattern, LEXICON)
=> "WRITING messaging, or texting, WAS the act of composing and sending electronic
messages between two or more mobile phones, or fixed or portable devices over a
phone network. The term originally referred to messages sent using the Short
Message Service (SMS). It has grown to include multimedia messages (known as MMS)
containing images, videos, and sound content,
as well as ideograms known as emoji\n"
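I'm watching Ruby Tapas, episode 190, and I tried it out in an IRB session. What I find interesting is that I get (?-mix:key)|(?-mix:key) after that expression. Can someone explain to me what that means?
Thanks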
Answer 0 (score: 0)
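This is covered in the docs. (?-mix:...) turns off the m, i, and x options for the subexpression inside the parentheses. It shows up here because joining Regexp objects into a string calls Regexp#to_s on each one, and to_s spells out the option state explicitly.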
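To make the explanation concrete, here is a small sketch of what Regexp#to_s returns with and without options set, and why the wrapped form survives being joined and recompiled. The variable names are made up for illustration and do not come from the question.

```ruby
# Regexp#to_s wraps the source in a group that records the option flags.
plain = Regexp.new(Regexp.escape("Text"))
plain_str = plain.to_s        # "(?-mix:Text)"  all three options off
plain_src = plain.source      # "Text"          the bare pattern, no wrapper

# With the i option set, i moves in front of the minus sign.
insensitive_str = /is/i.to_s  # "(?i-mx:is)"

# Array#join calls to_s on each Regexp, so every piece keeps its own
# options when the joined string is compiled again.
joined  = [plain, /is/i].join("|")
rebuilt = Regexp.new(joined)

puts plain_str
puts insensitive_str
puts joined
```

If only the bare pattern text is wanted, Regexp#source returns it without the (?-mix:...) wrapper.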
P.S. A simpler way of writing this:
LEXICON.keys.map {|k| Regexp.new(Regexp.escape(k)) }.join("|")
is this:
Regexp.union(LEXICON.keys)
assuming you ultimately need a Regexp rather than a string. See Regexp.union.
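As a rough sketch of how the Regexp.union version fits into the substitution from the question: the lowercase lexicon hash and the short sample string below are stand-ins for the question's LEXICON and TEXT, chosen only to keep the example small.

```ruby
# Stand-in for the question's LEXICON constant.
lexicon = {
  "Text" => "WRITING",
  "is"   => "WAS"
}

# Regexp.union escapes each string key and joins them with "|",
# returning a Regexp directly rather than a string.
pattern = Regexp.union(lexicon.keys)

# Shortened stand-in for the question's TEXT.
sample = "Text messaging is the act"

# With a Hash as the second argument, gsub replaces each match
# with the value stored under the matched text.
sanitized = sample.gsub(pattern, lexicon)

puts pattern.inspect
puts sanitized
```

Note that this pattern has no word boundaries, so "is" would also be replaced inside a longer word that contains it.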