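How do I read a string into a Ruby dictionary?

Time: 2013-11-30 03:18:42

Tags: ruby jekyll liquid

I am currently trying to replace my WordPress blog with a Jekyll blog. To do that, I have to find a replacement for WordPress caption tags: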

[caption id="attachment_76716" align="aligncenter" width="500"]<a href="http://martin-thoma.com/wp-content/uploads/2013/11/WER-calculation.png"><img src="http://martin-thoma.com/wp-content/uploads/2013/11/WER-calculation.png" alt="WER calculation" width="500" height="494" class="size-full wp-image-76716" /></a> WER calculation[/caption]

I think it would be nice if I could write them in my posts like this:

{% caption align="aligncenter" width="500" alt="WER calculation" text="WER calculation" url="../images/2013/11/WER-calculation.png" %}

which should be rendered to:

<div style="width: 510px" class="wp-caption aligncenter">
    <a href="../images/2013/11/WER-calculation.png">
        <img src="../images/2013/11/WER-calculation.png" alt="WER calculation" width="500" height="494" class="size-full">
    </a>
    <p class="wp-caption-text">WER calculation</p>
</div>

So I wrote some Python code to do the replacement (once), and now I want to write a Ruby / Liquid / Jekyll plugin to do the rendering. But I don't know how to read

align="aligncenter" width="500" alt="WER calculation" text="WER calculation" url="../images/2013/11/WER-calculation.png"

into a Ruby dictionary (they seem to be called "Hash"?).

Here is my plugin:

# Title: Caption tag
# Author: Martin Thoma, http://martin-thoma.com

module Jekyll
  class CaptionTag < Liquid::Tag

    def initialize(tag_name, text, tokens)
      super
      @text = text
      @tokens = tokens
    end

    def render(context)
        @hash = Hash.new
        @array = @text.split(" ")
        @array.each do |element|
            key, value = element.split("=")
            @hash[key] = value
        end
        #"#{@text} #{@tokens}"
        "<div style=\"width: #{@hash['width']}px\" class=\"#{@hash['alignment']}\">" +
        "<a href=\"../images/#{@hash['url']}\">" +
            "<img src=\"../images/#{@hash['url']}\" alt=\"#{@hash['text']}\" width=\"#{@hash['width']}\" height=\"#{@hash['height']}\" class=\"#{@hash['class']}\">" +
        "</a>" +
        "<p class=\"wp-caption-text\">#{@hash['text']}</p>" +
        "</div>"
    end
  end
end

Liquid::Template.register_tag('caption', Jekyll::CaptionTag)
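To make the problem concrete, here is a minimal sketch of what the whitespace split in `render` does to a quoted value that contains a space. It reuses the same `split(" ")` and `split("=")` calls as the plugin, but runs outside Liquid; the shortened `text` string and the `naive` variable are only for illustration.

```ruby
# Same splitting logic as in the plugin's render method, run standalone.
text = 'align="aligncenter" alt="WER calculation"'

naive = {}
text.split(" ").each do |element|
  key, value = element.split("=")
  naive[key] = value
end

# Inspect the result: the quotes are kept, and "WER calculation" is torn apart.
p naive
```

The `alt` value ends up as `"WER` (with the opening quote still attached), and `calculation"` becomes a key of its own with a `nil` value. Any working parser therefore has to treat spaces inside quotes differently from spaces between attributes.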

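In Python, I would use the CSV module and set the delimiter to a space and the quotechar to ". But I am new to Ruby.

I just saw that Ruby also has a CSV module. But it does not work, because the quoting is not correct. So I need some HTML parsing.

Python solution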

def parse(text):
    splitpoints = []

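    # parse: record every space that lies outside a quoted value
    isOpen = False
    for i, char in enumerate(text):
        if char == '"':
            isOpen = not isOpen
        if char == " " and not isOpen:
            splitpoints.append(i)
    splitpoints.append(len(text))  # close the final pair, which has no trailing space

    # build data structure
    dictionary = {}
    last = 0
    for i in splitpoints:
        key, value = text[last:i].split('=', 1)  # split on the first '=' only
        last = i+1
        dictionary[key] = value[1:-1] # remove the surrounding quotes
    return dictionary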

print(parse('align="aligncenter" width="500" alt="WER calculation" text="WER calculation" url="../images/2013/11/WER-calculation.png"'))
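Since the plugin itself has to be written in Ruby, the same quote-aware scan can be sketched in Ruby as well. This is a rough port under a few assumptions, not a tested plugin component: the method name `parse_by_scanning` is hypothetical, values are expected to be wrapped in double quotes, and escaped quotes inside values are not handled.

```ruby
# Hypothetical helper: walks the string once and splits on spaces that are
# not inside double quotes, mirroring the Python approach above.
def parse_by_scanning(text)
  attrs = {}
  in_quotes = false
  current = ""

  # A trailing space is appended so that the final pair is flushed too.
  "#{text} ".each_char do |char|
    in_quotes = !in_quotes if char == '"'

    if char == " " && !in_quotes
      unless current.empty?
        key, value = current.split("=", 2)
        # Strip one leading and one trailing double quote from the value.
        attrs[key] = value.to_s.gsub(/\A"|"\z/, "")
        current = ""
      end
    else
      current += char
    end
  end

  attrs
end

p parse_by_scanning('align="aligncenter" alt="WER calculation"')
```

Runs of several spaces between attributes are skipped, because an empty `current` is never stored.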

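1 Answer:

Answer 0 (score: 0)

If you set the row separator to a space, the column separator to '=' and the quote character to '"', you can use Ruby's CSV class to parse the string into a Hash easily: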

require 'csv'

def parse_attrs(input)
  options = { col_sep: '=', row_sep: ' ', quote_char: '"' }
  # The double splat passes the Hash as keyword arguments, which Ruby 3 requires.
  csv = CSV.new(input, **options)

  csv.each_with_object({}) do |row, attrs|
    attr, value = row
    value ||= true
    attrs[attr] = value
  end
end

Example:

irb(main):031:0> input = 'align="aligncenter" width="500" alt="WER calculation" text="WER calculation" url="../images/2013/11/WER-calculation.png" required'
=> "align=\"aligncenter\" width=\"500\" alt=\"WER calculation\" text=\"WER calculation\" url=\"../images/2013/11/WER-calculation.png\" required"
irb(main):032:0> parse_attrs input
=> {"align"=>"aligncenter", "width"=>"500", "alt"=>"WER calculation", "text"=>"WER calculation", "url"=>"../images/2013/11/WER-calculation.png", "required"=>true}
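Once the attributes are available as a Hash, the tag's `render` method only has to look values up by key. The following is a minimal sketch of that last step, kept free of Liquid so it can be tried on its own. The helper name `caption_html` is hypothetical, the 10px of extra wrapper width is inferred from the expected output in the question (500 becomes 510px), and the values are not HTML-escaped.

```ruby
# Hypothetical helper, not part of the original plugin: builds the caption
# markup from an attribute Hash such as the one parse_attrs returns.
def caption_html(attrs)
  width = Integer(attrs.fetch("width"))
  url   = attrs.fetch("url")
  align = attrs.fetch("align")
  text  = attrs.fetch("text", "")
  alt   = attrs.fetch("alt", text)

  # Assumption: the wrapper is 10px wider than the image, as in the
  # markup the question wants to produce.
  <<~HTML
    <div style="width: #{width + 10}px" class="wp-caption #{align}">
      <a href="#{url}">
        <img src="#{url}" alt="#{alt}" width="#{width}" class="size-full">
      </a>
      <p class="wp-caption-text">#{text}</p>
    </div>
  HTML
end

puts caption_html({
  "align" => "aligncenter",
  "width" => "500",
  "alt"   => "WER calculation",
  "text"  => "WER calculation",
  "url"   => "../images/2013/11/WER-calculation.png"
})
```

Inside the Liquid tag, `render` could then return something like `caption_html(parse_attrs(@text.strip))`. Note that the key is `align`, matching the attribute name used in the tag, and that `fetch` raises a `KeyError` when a required attribute is missing instead of silently rendering an empty value.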