How to extract all attributes from an img tag

Date: 2017-01-01 14:09:41

Tags: ruby-on-rails ruby parsing nokogiri

I'm trying to use Nokogiri to convert:

<img class="img-responsive" src="img/logologo.png" alt=""> 

into:

<%= image_tag('img/logologo.png', :class => 'img-responsive', :alt => '') %>

Here's my code:

# a = <img class="img-responsive" src="img/logologo.png" alt="" width="256" height="256"> 
page = Nokogiri::HTML(a)
img = page.css('img')[0]
src =  ""
alt =  ""
class_atr = ""
src =  img['src'] if img['src'].present?
alt =  img['alt'] if img['alt'].present?
class_atr = img['class'] if img['class'].present?
result = "<%= image_tag(\'" + src + '\', :class => \'' + class_atr + '\', :alt => \'' + alt + '\')%>'

This feels hard-coded. Is there a way to extract all of the attributes along with the src?

The image tag may also contain height and width attributes. How can I automatically extract all of the attributes and convert them to ERB?

2 answers:

Answer 0: (score: 0)

Iterate over all of the <img> tags inside the HTML and read the attributes of each one.

Answer 1: (score: 0)

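OK, there are a number of things to work on. Let's start with how to parse the HTML. If what you're parsing is a fragment or a single tag, you can use DocumentFragment to tell Nokogiri not to add the usual HTML wrapper tags: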

doc = Nokogiri::HTML::DocumentFragment.parse('<img class="img-responsive" src="img/logologo.png" alt="">')
doc.to_html # => "<img class=\"img-responsive\" src=\"img/logologo.png\" alt=\"\">"

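By contrast, parsing the same string with Nokogiri::HTML produces a complete document, with the tag wrapped in a DOCTYPE, html and body.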

Next, don't use css, xpath or search if you mean at, at_css or at_xpath. Meditate on this:

doc.css('img').class # => Nokogiri::XML::NodeSet
doc.at('img').class # => Nokogiri::XML::Element

doc.css('img')[0].to_html # => "<img class=\"img-responsive\" src=\"img/logologo.png\" alt=\"\">"
doc.css('img').first.to_html # => "<img class=\"img-responsive\" src=\"img/logologo.png\" alt=\"\">"
doc.at('img').to_html # => "<img class=\"img-responsive\" src=\"img/logologo.png\" alt=\"\">"

That css, xpath and search return a NodeSet is important and worth remembering. at and its variants are equivalent to calling first or [0] on the returned NodeSet: they return the first matching node. So use at and friends if that's what you mean, because it results in less noisy code.

Here's my take on it:

require 'nokogiri'

doc = Nokogiri::HTML::DocumentFragment.parse('<img class="img-responsive" src="img/logologo.png" alt="">')

img = doc.at('img')
img_src = img.delete('src')
img_params = img.map { |p, v| ":%s => '%s'" % [p, v] }.join(', ') # => ":class => 'img-responsive', :alt => ''"
img_template = "<%%= image_tag('%s', %s) %%>" % [img_src, img_params] # => "<%= image_tag('img/logologo.png', :class => 'img-responsive', :alt => '') %>"

Of course, writing the key/value pairs in :k => "v" format is old school. I'd recommend this instead:

img_params = img.map { |p, v| "%s: '%s'" % [p, v] }.join(', ') # => "class: 'img-responsive', alt: ''"

Resulting in:

"<%= image_tag('img/logologo.png', class: 'img-responsive', alt: '') %>"
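The question also mentions tags that carry width and height. A rough sketch of how the same idea could be wrapped in a reusable method is shown here. image_tag_erb is a hypothetical helper, not part of Rails or Nokogiri, and it takes a plain Hash of attributes (which could be built from a Nokogiri node with something like img.map { |name, value| [name, value] }.to_h). It also escapes single quotes in values, which the format-string version does not.

```ruby
# Hypothetical helper (not part of Rails or Nokogiri): builds an image_tag ERB
# snippet from a Hash of attribute names and values.
def image_tag_erb(attributes)
  attrs = attributes.dup # work on a copy so the caller's Hash is not mutated
  src = attrs.delete('src').to_s

  # Escape backslashes and single quotes so the value remains a valid
  # single-quoted Ruby string literal.
  quote = ->(value) { "'" + value.to_s.gsub(/[\\']/) { |c| "\\#{c}" } + "'" }

  options = attrs.map do |name, value|
    # Names such as data-id are not bare identifiers, so they get quoted keys.
    key = name =~ /\A[a-z_]\w*\z/i ? name : quote.(name)
    "#{key}: #{quote.(value)}"
  end

  "<%= image_tag(#{[quote.(src), *options].join(', ')}) %>"
end

attrs = {
  'class' => 'img-responsive', 'src' => 'img/logologo.png',
  'alt' => '', 'width' => '256', 'height' => '256'
}
puts image_tag_erb(attrs)
# prints: <%= image_tag('img/logologo.png', class: 'img-responsive', alt: '', width: '256', height: '256') %>
```

Every attribute other than src is passed through as an option, so nothing needs to change when a tag gains or loses attributes.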