Simple regex question

Time: 2010-10-16 22:08:43

Tags: ruby-on-rails ruby regex

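I have titles on my blog that look like this: Main Idea, key term, key term, key term

I want the main idea and the key terms to have different font sizes. My first thought was to find everything from the first comma to the end of the string and replace that chunk with the same text, wrapped in a span tag with a class that makes the font smaller.

Here is the plan:

HTML (before)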

  <a href="stupidreqexquestion">Main Idea, key term, key term, key term</a>

HTML (after)

  <a href="stupidreqexquestion">Main Idea <span class="smaller_font">, key term, key term, key term</span></a>

I'm using Rails, so I plan to add this as a helper function, for example:

Helper

  def make_key_words_in_title_smaller(title)
      #replace the keywords in the title with key words surrounded by span tags
  end 

View

  <% @posts.each do |post |%>
      <%= make_key_words_in_title_smaller(post.title)%>
  <% end -%>

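2 answers:

Answer 0: (score: 3)

This works as long as you don't care about cases where the Main Idea part itself contains a comma inside double quotes, for example "Welcome home, Roxy Carmichael":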

>> t = "Main Idea, key term, key term, key term"
=> "Main Idea, key term, key term, key term"

>> t.gsub(/(.*?)(,.*)/, '\1 <span class="smaller_font">\2</span>')
=> "Main Idea <span class=\"smaller_font\">, key term, key term, key term</span>"
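
The substitution above could be wrapped in the helper from the question roughly as follows. This is only a sketch in plain Ruby: the method name comes from the question, while the `CGI.escapeHTML` calls and the fallback for titles without a comma are assumptions added here so that user-supplied titles cannot inject markup.

```ruby
require 'cgi'

# Sketch of the helper named in the question (plain Ruby, no Rails required).
def make_key_words_in_title_smaller(title)
  # Lazily capture everything before the first comma, then the rest.
  match = /\A(.*?)(,.*)\z/m.match(title)

  # Assumption: a title without a comma has no key terms, so only escape it.
  return CGI.escapeHTML(title) unless match

  # Escape both captures before wrapping the key terms in the span.
  main_idea, key_terms = match.captures.map { |part| CGI.escapeHTML(part) }
  %Q{#{main_idea} <span class="smaller_font">#{key_terms}</span>}
end

puts make_key_words_in_title_smaller('Main Idea, key term, key term, key term')
# prints: Main Idea <span class="smaller_font">, key term, key term, key term</span>
```

In a Rails view that escapes output automatically, the returned string would probably also need to be marked as safe (for example with `html_safe`) for the span to render as markup.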

Answer 1: (score: 2)

If the string is unadorned (i.e., it contains no tags), either of these will work fine:

data = 'Main Idea, key term, key term, key term'

# example #1
/^(.+?, )(.+)/.match(data).captures.each_slice(2).map { |a,b| a << %Q{<span class="smaller_font">#{ b }</span>}}.first 
# => "Main Idea, <span class=\"smaller_font\">key term, key term, key term</span>"

# example #2
data =~ /^(.+?, )(.+)/
$1 << %Q{<span class="smaller_font">#{ $2 }</span>} 
# => "Main Idea, <span class=\"smaller_font\">key term, key term, key term</span>"
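
Both examples assume the title contains a comma followed by whitespace. When it does not, `match` returns `nil` and `$1` is `nil`, so either one raises `NoMethodError`. A minimal sketch of a more forgiving variant is shown below; it swaps the regex match for `String#split` with a limit of 2, and the method name `shrink_key_terms` is hypothetical.

```ruby
# Hypothetical helper name; split-based variant of the two examples above.
def shrink_key_terms(title)
  # Split once, at the first comma followed by whitespace.
  main_idea, key_terms = title.split(/,\s+/, 2)

  # Without such a comma there is nothing to wrap, so return the title as-is.
  return title if key_terms.nil?

  %Q{#{main_idea}, <span class="smaller_font">#{key_terms}</span>}
end

puts shrink_key_terms('Main Idea, key term, key term, key term')
# prints: Main Idea, <span class="smaller_font">key term, key term, key term</span>
puts shrink_key_terms('Main Idea')
# prints: Main Idea
```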

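If the string does contain tags, using regex to process HTML or XML is discouraged because it breaks so easily. Extremely trivial uses against HTML you control are pretty safe, but if the content or format changes, the regex can fall apart and break your code.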
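
To illustrate how easily this goes wrong, the sketch below applies the comma-based substitution from the first answer to a whole tag instead of a bare title. The `title` attribute on the link is a hypothetical addition, chosen only because it contains a comma.

```ruby
# The comma-based substitution from the first answer.
pattern     = /(.*?)(,.*)/
replacement = '\1 <span class="smaller_font">\2</span>'

# A bare title behaves as intended.
plain = 'Main Idea, key term'
plain_result = plain.gsub(pattern, replacement)

# Hypothetical markup: the first comma now sits inside an attribute value.
tagged = '<a href="post" title="Ideas, big and small">Main Idea, key term</a>'
tagged_result = tagged.gsub(pattern, replacement)

puts plain_result
# prints: Main Idea <span class="smaller_font">, key term</span>
puts tagged_result
# prints: <a href="post" title="Ideas <span class="smaller_font">, big and small">Main Idea, key term</a></span>
```

The span opens in the middle of the attribute and closes after the link, so the resulting markup is invalid.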

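HTML parsers are the usual recommended solution because they keep working if the content or its format changes. This is what I'd do using Nokogiri. I was deliberately verbose to explain what is going on: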

require 'nokogiri'

# build a sample document
html = '<a href="stupidreqexquestion">Main Idea, key term, key term, key term</a>'
doc = Nokogiri::HTML(html) 

puts doc.to_s, ''

# find the link
a_tag = doc.at_css('a[href=stupidreqexquestion]')

# break down the tag content
a_text = a_tag.content
main_idea, key_terms = a_text.split(/,\s+/, 2) # => ["Main Idea", "key term, key term, key term"]
a_tag.content = main_idea

# create a new node
span = Nokogiri::XML::Node.new('span', doc)
span['class'] = 'smaller_font'
span.content = key_terms

puts span.to_s, ''

# add it to the old node
a_tag.add_child(span)

puts doc.to_s
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body><a href="stupidreqexquestion">Main Idea, key term, key term, key term</a></body></html>
# >> 
# >> <span class="smaller_font">key term, key term, key term</span>
# >> 
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body><a href="stupidreqexquestion">Main Idea<span class="smaller_font">key term, key term, key term</span></a></body></html>

In the output above you can see how Nokogiri built the sample document, the span that was added, and the resulting document.

It can be simplified to:

require 'nokogiri'

doc = Nokogiri::HTML('<a href="stupidreqexquestion">Main Idea, key term, key term, key term</a>')

a_tag = doc.at_css('a[href=stupidreqexquestion]')
main_idea, key_terms = a_tag.content.split(/,\s+/, 2)
a_tag.content = main_idea

a_tag.add_child("<span class='smaller_font'>#{ key_terms }</span>")

puts doc.to_s
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body><a href="stupidreqexquestion">Main Idea<span class="smaller_font">key term, key term, key term</span></a></body></html>