I am using Nokogiri to scan a document and remove specific files that are stored as attachments. However, I want to leave a note in-line showing that the file was removed.
E.g.:
<a href="...">File Download</a>
Converted to:
File Removed
Here is what I tried:
@doc = Nokogiri::HTML(html).to_html
@doc.search('a').each do |attachment|
attachment.remove
attachment.content = "REMOVED"
# ALSO TRIED:
attachment.content = "REMOVED"
end
The second one does replace the anchor text, but it keeps the href, so the user can still download the file.
How can I replace the anchor element with a <p> containing a new string?
Answer 0 (score: 1)
Use a combination of create_element and replace to achieve this. See the comments in the code below.
html = '<a href="...">File Download</a>'
dom = Nokogiri::HTML(html) # parse with nokogiri
dom.to_s # original content
#=> "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body><a href=\"...\">File Download</a></body></html>\n"
# scan the dom for hyperlinks
dom.css('a').each do |a|
node = dom.create_element 'p' # create paragraph element
node.inner_html = "REMOVED" # add content you want
a.replace node # replace found link with paragraph
end
dom.to_s # modified html
#=> "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body><p>REMOVED</p></body></html>\n"
Hope this helps.