Rails: split an HTML string only between words

Time: 2014-09-30 02:51:11

Tags: html ruby-on-rails regex ruby-on-rails-3

Given this variable:

=> str = " and then there was a gigantic <a href="link.com/bug.jpg">bug</a> on her nose!"

How can I write a function so that, instead of breaking wherever the character limit happens to fall:

=> str[0..33] = " and then there was a gigantic <a "

I get something that plays nicely with HTML and, if a tag has been opened, returns everything through its closing tag:

=> some_function(str) = " and then there was a gigantic <a href="link.com/bug.jpg">bug</a>"

I would even settle for something cruder, like:

=> worse_function(str) = " and then there was a gigantic"

Any help would be great. Obviously, it needs to take a rough character limit or word limit.

Update

So far, I have this:

def friendly_excerpt(string, length)
  excerpt = string.split[0..length].to_s
  if excerpt.include?('<') && !excerpt.include?('>')
    friendly_excerpt = excerpt.slice(0..(excerpt.index('<')))
  end
  friendly_excerpt
end

4 answers:

Answer 0 (score: 1)

I would:

  1. Count how many < there are in the string
  2. Check the index of each <
  3. Remove the tag from < to >

So it would be something like this:

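    def remove_html_tag(str)
      result = str
      tag_count = str.count('<')

      # Each iteration removes one <...> tag, so run once per '<'
      tag_count.times do
        index_1 = result.index('<')
        # Look for the closing '>' only after the '<' just found
        index_2 = index_1 && result.index('>', index_1)
        break unless index_2 # no complete tag left to remove
        # Keep the text before '<' and the text after '>'
        result = result[0...index_1] + result[(index_2 + 1)..-1]
      end

      result
    end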
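
The loop above strips tags one at a time. As a hedged alternative, the same effect can be sketched with a single regex substitution. This swaps in a different technique from the answer's loop, and `strip_tags_with_regex` is a hypothetical name used only for illustration. It assumes no `>` appears inside attribute values.

```ruby
# Hypothetical helper: remove every <...> run in one pass.
# Assumes '>' never appears inside an attribute value.
def strip_tags_with_regex(str)
  str.gsub(/<[^>]*>/, '')
end

str = ' and then there was a gigantic <a href="link.com/bug.jpg">bug</a> on her nose!'
strip_tags_with_regex(str)
# => " and then there was a gigantic bug on her nose!"
```

Either version only removes markup; truncating the remaining text at a word boundary is a separate step.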
    

Answer 1 (score: 1)

I came up with this solution:

def friendly_excerpt(string, length)
  excerpt = string.split[0..length].join(' ')
  if excerpt.include?('<') && !excerpt.include?('>')
    friendly_excerpt = excerpt.slice(0..(excerpt.index('<') - 1)).strip
  else
    friendly_excerpt = excerpt.strip
  end
  friendly_excerpt
end

It seems to work quite well.
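
The function above takes its limit in words. Since the question also mentions a character limit, here is a minimal sketch of a character-based variant. The name `char_limited_excerpt` is hypothetical. It backs up to the last whitespace so no word is split, then drops a trailing tag that was cut off mid-way. It does not close elements left open by the cut, so treat it as the "cruder" option from the question.

```ruby
# Hypothetical helper: keep at most `limit` characters, never splitting
# a word and never leaving a half-written "<..." tag at the end.
def char_limited_excerpt(string, limit)
  return string.strip if string.length <= limit

  # Take one extra character: if it is whitespace, the cut at `limit`
  # already falls between words and nothing more needs to be dropped.
  cut = string[0, limit + 1]
  last_space = cut.rindex(/\s/)
  cut = last_space ? cut[0, last_space] : cut[0, limit]

  # If the last '<' has no '>' after it, the cut fell inside a tag.
  open_idx = cut.rindex('<')
  cut = cut[0, open_idx] if open_idx && cut.index('>', open_idx).nil?

  cut.strip
end

str = ' and then there was a gigantic <a href="link.com/bug.jpg">bug</a> on her nose!'
char_limited_excerpt(str, 34)
# => "and then there was a gigantic"
```

With a limit of 65 the cut lands just after `</a>`, so the whole link is kept.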

Answer 2 (score: 0)

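If your goal is to cleanly truncate a string that contains HTML, then rather than writing the function yourself I would suggest the html_truncator gem. It uses Nokogiri to parse the HTML and then handles the truncation appropriately.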

Example (more on the GitHub page):

HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3)
# => "<p>Lorem ipsum dolor…</p>"

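Note that by default the truncation length argument is counted in words rather than characters, but there is an option to count characters instead.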

HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 12, :length_in_chars => true)
# => "<p>Lorem ipsum…</p>"

Answer 3 (score: 0)

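The minute I see HTML, I turn to Nokogiri, because I can't handle opening and closing HTML elements by hand. I have tried too many times. Assuming you have Nokogiri installed...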

require 'nokogiri'

html_string = ' and then there was a gigantic <a href="link.com/bug.jpg">bug</a> on her nose!'
min_length = 33
res = Nokogiri.HTML(html_string)
nodes = res.elements.children.children.children #I wish I knew why all of these are needed.
nodes.reduce('') { |new_string, node| 
   break new_string if new_string.length > min_length; 
   new_string + node.to_html 
}
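
For readers without Nokogiri at hand, the same "append whole nodes until the string is long enough" idea can be approximated with a regex tokenizer. This is a rough sketch and not a replacement for a real parser: `top_level_chunks` and `excerpt_by_chunks` are hypothetical names, and the regex assumes well-formed markup with no element nested inside another element of the same name.

```ruby
# Hypothetical helper: split simple markup into top-level chunks. A chunk
# is a whole element (<a ...>bug</a>), a lone tag, or a run of plain text.
def top_level_chunks(html)
  chunks = []
  html.scan(%r{<(\w+)[^>]*>.*?</\1>|<[^>]*>|[^<]+}m) do
    chunks << Regexp.last_match[0] # the full match, not the capture group
  end
  chunks
end

# Same reduce as in the answer, fed with regex chunks instead of nodes:
# stop once the accumulated string is longer than min_length.
def excerpt_by_chunks(html, min_length)
  top_level_chunks(html).reduce('') do |new_string, chunk|
    break new_string if new_string.length > min_length
    new_string + chunk
  end
end

html_string = ' and then there was a gigantic <a href="link.com/bug.jpg">bug</a> on her nose!'
excerpt_by_chunks(html_string, 33)
# => " and then there was a gigantic <a href=\"link.com/bug.jpg\">bug</a>"
```

Because chunks are only ever appended whole, the result can overshoot `min_length`, but it never ends inside a tag or inside the link.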