Replace <a> with Nokogiri

Time: 2018-06-18 15:56:18

Tags: html ruby replace nokogiri

I am using Nokogiri to scan a document and remove links to specific files that are stored as attachments. However, I want to leave an in-line note where each link was removed.

E.g.

<a href="...">File Download</a>

Converted to:

File Removed

Here is what I tried:

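@doc = Nokogiri::HTML(html) # keep the parsed document; .to_html here would return a String
@doc.search('a').each do |attachment|
    # First attempt: remove the node, then set its text
    attachment.remove
    attachment.content = "REMOVED"

    # ALSO TRIED: only set the text
    attachment.content = "REMOVED"
end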

The second one does replace the anchor text, but it keeps the href, so the user can still download the file.

How can I replace the anchor element with a <p> containing a new string?

1 answer:

Answer 0: (score: 1)

Use a combination of create_element and replace to achieve this. See the comments in the code below.

require 'nokogiri'

html = '<a href="...">File Download</a>'
dom = Nokogiri::HTML(html) # parse with nokogiri
dom.to_s # original content
#=> "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body><a href=\"...\">File Download</a></body></html>\n"

# scan the dom for hyperlinks
dom.css('a').each do |a|
  node = dom.create_element 'p' # create paragraph element
  node.inner_html = "REMOVED" # add content you want
  a.replace node # replace found link with paragraph
end
dom.to_s # modified html
#=> "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body><p>REMOVED</p></body></html>\n"

Hope this helps.