I have a large string that looks like this:
"This is the text and this is a <a href='http://...' class='some_link'> link</a> that I would like to keep, however this is a <a href='http://...'>link with a keyword</a> that I would like to remove"
My goal is to replace the URL with '#' in every link whose tag contains a particular keyword.
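What I currently do is:
arr = str.scan(/<a(.*?)a>/)
to find all the links in the string, then .join each arr item and check whether it includes? the keyword I am looking for. This is long, complicated, and inefficient. Any idea how to do this in a single pass?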
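Answer 0 (score: 3)
You can use the block form of gsub. It passes each match to the block, and whatever the block returns is used as the replacement for that match.
A silly example: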
replaced = 'Long, complicated, inefficient'.gsub(/\w+/) do |match|
  puts "match is: #{match}"
  if match.length > 5
    'big word'
  else
    match
  end
end
puts replaced
# >> match is: Long
# >> match is: complicated
# >> match is: inefficient
# >> Long, big word, big word
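Applied to the string from the question, the same block form might look like the sketch below. The method name neutralize_links, the constant LINK_PATTERN, and the keyword value are hypothetical, chosen only for illustration. It assumes the links are simple, non-nested <a> elements with a quoted href, and that "keyword inside the tag" means the keyword appears in the link text.

```ruby
# Hypothetical pattern: matches one whole <a ...>...</a> element and
# captures the link text in group 1. Assumes links are not nested.
LINK_PATTERN = /<a\b[^>]*>(.*?)<\/a>/m

# Hypothetical helper: returns a copy of html in which every link whose
# text contains keyword has its href value replaced with '#'.
def neutralize_links(html, keyword)
  html.gsub(LINK_PATTERN) do |link|
    link_text = Regexp.last_match(1)
    if link_text.include?(keyword)
      # Swap only the href value and keep the original quote character.
      link.sub(/href=(['"]).*?\1/, 'href=\1#\1')
    else
      # Returning the match unchanged leaves this link as it was.
      link
    end
  end
end

str = "This is the text and this is a <a href='http://...' class='some_link'> link</a> " \
      "that I would like to keep, however this is a <a href='http://...'>link with a keyword</a> " \
      "that I would like to remove"

puts neutralize_links(str, 'keyword')
```

This handles everything in one gsub call instead of scan, join and include?. A regex is only reasonable here because the input is assumed to be simple; for arbitrary HTML, a parser is the safer choice.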
Answer 1 (score: 0)
A solution using Nokogiri:
html_links.xpath('//a').each do |link|
  ...
end