I have a question almost identical to "Ruby gsub multiple characters in string".
However, my string contains special characters:
a = "<p>text</p> <strong>bold</strong> and <em>italic</em>"
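Using /\w+/ does not work for me.
I have tried many different combinations, with no luck.
Which regular expression should I pass below to make this work? I want to replace those matches wherever they occur in the string.
By the way, I am using Rails.
The replacements I want are: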
a.gsub({{WHAT REGEX EXP?}},
"\r\n" => "",
"<p>" => "",
"</p>" => "\n\n",
"<br />" => "\n",
"<strong>" => "*",
"</strong>" => "*",
"<em>" => "_",
"</em>" => "_",
"<s>" => "~",
"</s>" => "~",
"<blockquote>" => ">",
"</blockquote>" => ">",
"&" => "&",
"<" => "<",
">" => ">"
)
Answer 0 (score: 2)
Here is how this can work with one #gsub! call per pair:
replacements = {
"\r\n" => "",
"<p>" => "",
"</p>" => "\n\n",
"<br />" => "\n",
"<strong>" => "*",
"</strong>" => "*",
"<em>" => "_",
"</em>" => "_",
"<s>" => "~",
"</s>" => "~",
"<blockquote>" => ">",
"</blockquote>" => ">",
"&" => "&",
"<" => "<",
">" => ">"
}
a = "<p>text</p> <strong>bold</strong> and <em>italic</em>"
replacements.each do |find, replace|
a.gsub!(find, replace)
end
a # => "text\n\n *bold* and _italic_"
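If you would rather not mutate the original string, the same loop can be written as a fold. This is only a minimal sketch: apply_replacements is a hypothetical helper name, not something from Ruby or Rails, and the hash here is a shortened copy of the one above. Pairs are applied in hash insertion order, and each pass scans the output of the previous one.

```ruby
# Hypothetical helper (name is an assumption): applies each find/replace pair
# in insertion order and returns a new string, leaving the input untouched.
def apply_replacements(text, replacements)
  replacements.reduce(text) do |result, (find, replace)|
    result.gsub(find, replace)
  end
end

# Shortened copy of the replacements hash, for illustration only.
sample_replacements = {
  "<p>" => "",
  "</p>" => "\n\n",
  "<strong>" => "*",
  "</strong>" => "*"
}

original = "<p>text</p> <strong>bold</strong>"
converted = apply_replacements(original, sample_replacements)
```

Because every pair triggers a full pass over the string, this costs one scan per key, which is the main reason the single-call approaches are attractive.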
Answer 1 (score: 1)
You can do this in a single call, using the regex /<[^>]+>|[<>&]/:
a = "<p>text</p> <strong>bold</strong> and <em>italic</em> & <>"
a.gsub(/(<[^>]+>|[<>&])/, replacements)
# => "text\n\n *bold* and _italic_ & <>"
String#gsub(pattern, hash) → new_str
If the second argument is a Hash and the matched text is one of its keys, the corresponding value is the replacement string (Ruby docs).
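One consequence worth checking before relying on this: when the matched text is not a key, the lookup returns nil and the match is replaced with an empty string. The sketch below illustrates that with a made-up input containing a <u> tag, and shows one possible way to keep unknown matches by giving the hash a default block. The variable names are assumptions for illustration.

```ruby
tag_replacements = { "<em>" => "_", "</em>" => "_" }
text = "<em>hi</em> <u>there</u>"
tag_pattern = /<[^>]+>/

# "<u>" and "</u>" are not keys, so they are looked up as nil and become "".
stripped = text.gsub(tag_pattern, tag_replacements)

# Assumption: to keep unknown tags instead, use a hash whose default block
# returns the matched text itself.
keep_unknown = Hash.new { |_hash, tag| tag }.merge!(tag_replacements)
kept = text.gsub(tag_pattern, keep_unknown)
```

Here stripped drops the <u> tags entirely, while kept leaves them in place.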
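Regex explanation:
<[^>]+> matches an HTML tag: first a literal <, then one or more characters that are not > (the [^>]+ part), and finally a closing >
[<>&] matches a single occurrence of a special character: <, > or &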
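To see what the pattern actually picks up, String#scan can list the matches. This is a small sketch on made-up input strings. The second string is an assumption chosen to show a limitation: a stray < followed later by a > is treated as one tag.

```ruby
pattern = /<[^>]+>|[<>&]/

# Tags match the first alternative; lone special characters match the second.
matches = "<p>text</p> & <>".scan(pattern)

# Limitation: anything between a "<" and the next ">" looks like a tag.
false_tag = "1 < 2 and 3 > 2".scan(pattern)
```

In the first case the empty <> is not a tag (the pattern needs at least one character inside), so it is matched as two separate characters.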
That said, a regex is not the best tool for processing HTML; it is better to use an HTML parser (for example, Nokogiri).
Answer 2 (score: 1)
It can be done in one go:
replacements = {
"\r\n" => "",
"<p>" => "",
"</p>" => "\n\n",
"<br />" => "\n",
"<strong>" => "*",
"</strong>" => "*",
"<em>" => "_",
"</em>" => "_",
"<s>" => "~",
"</s>" => "~",
"<blockquote>" => ">",
"</blockquote>" => ">",
"&" => "&",
"<" => "<",
">" => ">"
}
keys = Regexp.union(replacements.keys)
a = "<p>text</p> <strong>bold</strong> and <em>italic</em>"
p a.gsub(keys, replacements) # => "text\n\n *bold* and _italic_"
This works easily because Regexp.union does all the hard work for you (escaping the special characters).
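Two details of Regexp.union are worth a quick check. It escapes regex metacharacters in string arguments, and the resulting alternatives are tried left to right, so a short key listed before a longer key that starts with it wins. The sketch below uses made-up strings. Sorting keys by length is an assumption about how you might guard against that, not something the hash above needs, since its tag keys already come before "<" and ">".

```ruby
# Metacharacters in string arguments are escaped, so they match literally.
literal = Regexp.union("a.b", "c+d")
literal_source = literal.source
literal_hits = "axb a.b".scan(literal)

# Alternatives are tried in order: the first one that matches is used.
short_first = "<p>".scan(Regexp.union("<", "<p>"))
long_first = "<p>".scan(Regexp.union("<p>", "<"))

# Assumption: sorting keys longest-first is one way to make ordering safe.
keys = ["<", "<p>", "</p>"]
ordered = Regexp.union(keys.sort_by { |key| -key.length })
ordered_hits = "<p>x</p>".scan(ordered)
```

With the short key first, only "<" is consumed and the rest of the tag is left behind; with the longer key first, the whole tag is matched.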