Question

I have the following YAML file:

site:
  title: My blog
  domain: example.com
  author1:
    name: bob
    url: /author/bob
  author2:
    name: jane
    url: /author/jane
  header_links:
    about:
      title: About
      url: about.html
    contact:
      title: Contact Us
      url: contactus.html
  js_deps:
    - cashjs
    - jets

products:
  product1:
    name: Prod One
    price: 10
  product2:
    name: Prod Two
    price: 20

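I would like a Bash, Python or AWK function or script that takes the YAML file above as input (as $1), then generates and executes the following code (or something entirely equivalent):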

unset site_title 
unset site_domain
unset site_author1
unset site_author2
unset site_header_links
unset site_header_links_about
unset site_header_links_contact
unset js_deps

site_title="My blog"
site_domain="example.com"

declare -A site_author1
declare -A site_author2

site_author1=(
  [name]="bob"
  [url]="/author/bob"
)

site_author2=(
  [name]="jane"
  [url]="/author/jane"
)

declare -A site_header_links_about
declare -A site_header_links_contact

site_header_links_about=(
  [name]="About"
  [url]="about.html"
)

site_header_links_contact=(
  [name]="Contact Us"
  [url]="contact.html"
)

site_header_links=(site_header_links_about  site_header_links_contact)

js_deps=(cashjs jets)

unset products
unset product1
unset product2

declare -A product1
declare -A product2

product1=(
  [name]="Prod One"
  [price]=10
)

product2=(
  [name]="Prod Two"
  [price]=20
)

products=(product1 product2)

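So, the logic is:

Walk through the YAML and create underscore-concatenated variable names holding string values, except at the last (bottom) level, where the data should be created as associative arrays or indexed arrays wherever possible. Also, any assoc array that gets created should be listed by name in an indexed array.

So, in other words:

Wherever the last level of data can be turned into an associative array, it should be (foo.bar.hash => ${foo_bar_hash[@]})
Wherever the last level of data can be turned into an indexed array, it should be (foo.bar.list => ${foo_bar_list[@]})
Every assoc array should be listed by name in an indexed array that is named after its parent in the yaml data (see products in the example)
Otherwise, just build an underscore-concatenated var name and save the value as a string (foo.bar.string => ${foo_bar_string})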
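The four rules above can be sketched as a small classifier over already-parsed data. This is only an illustration in Python under stated assumptions: the names classify and walk are made up here, the input is a plain dict rather than a parsed YAML file, and names are always built from the full path (the desired output in the question shortens some names, such as js_deps and product1, which this sketch does not reproduce).

```python
def classify(node):
    """Return which Bash shape a node maps to under the rules above."""
    if isinstance(node, dict) and all(
        not isinstance(v, (dict, list)) for v in node.values()
    ):
        return "assoc"      # last-level mapping -> associative array
    if isinstance(node, list) and all(
        not isinstance(v, (dict, list)) for v in node
    ):
        return "indexed"    # last-level sequence -> indexed array
    if isinstance(node, (dict, list)):
        return "container"  # deeper nesting: keep descending
    return "string"         # plain scalar -> string variable


def walk(node, path=()):
    """Yield (bash_name, shape) for every last-level structure or scalar."""
    shape = classify(node)
    if shape == "container":
        items = node.items() if isinstance(node, dict) else enumerate(node)
        for key, child in items:
            yield from walk(child, path + (str(key),))
    else:
        yield "_".join(path), shape


# Hypothetical data mirroring the foo.bar.* examples in the list above.
data = {"foo": {"bar": {"hash": {"a": "1"}, "list": ["x", "y"], "string": "s"}}}
for name, shape in walk(data):
    print(name, shape)
```

Run on the sample data, this reports foo_bar_hash as an assoc array, foo_bar_list as an indexed array and foo_bar_string as a string.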

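...the reason I need this specific Bash data structure is that I am using a Bash-based templating system.

Once I have the function I need, I will be able to use the YAML data easily in my templates, like so: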

{{site_title}}

...

{{#foreach link in site_header_links}}
  <a href="{{link.url}}">{{link.name}}</a>
{{/foreach}}

...

{{#js_deps}}
  {{.}}
{{/js_deps}}

...

{{#foreach item in products}}
  {{item.name}}
  {{item.price}}
{{/foreach}}

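What I have tried:

This is closely related to a question I asked previously:

How to convert a subset of YAML into an indexed array of associative arrays?

That got so close, but I also need the associative arrays for site_header_links to be generated successfully, and it fails because site_header_links is nested too deep.

I would still love to use https://github.com/azohra/yaml.sh in the solution, as it would also give the templating system a simple handlebars-style lookup trick :)

EDIT:

To be very clear: the solution cannot use pip, virtualenv, or any other external deps that need to be installed separately. It must be a self-contained script/func that can live in the CMS project directory (like https://github.com/azohra/yaml.sh)... otherwise I would not need to be here.

...

Hopefully, a well-commented answer will help me avoid coming back here ;)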

Answer 1

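It is difficult to work out the rules of a card game just by watching people play one round. In the same way, it is difficult to see exactly what the "rules" for your YAML file are.

In the following I have made assumptions about the root-level nodes as well as about the first-, second- and third-level nodes and the output they generate. It would be equally valid to make assumptions about a node based on the level of its parent. That would be more flexible (you could then simply add, for example, a sequence at the root level), but it would be somewhat harder to implement.

Keeping the declares and compound array assignments interspersed with the other code, and grouped per "similar" item, is a bit cumbersome. For that you would need to keep track of the node types (str, dict, nested dict) and group on those. So for each root-level key I dump all the unsets first, then all the declares, then all the assignments, then all the compound assignments. I think that falls under "something entirely equivalent".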
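The fixed output order (unsets, declares, assignments, compound assignments) can be captured in one helper. The sketch below is only an illustration of that ordering: emit_groups is a hypothetical name that does not appear in the program in this answer, and the sample values are made up.

```python
import io
import sys


def emit_groups(unsets, declares, assignments, compounds, fp=sys.stdout):
    """Write the four groups in a fixed order, each followed by a blank line."""
    groups = [
        [f"unset {name}" for name in unsets],
        [f"declare -A {name}" for name in declares],
        list(assignments),
        [f"{name}=({' '.join(members)})" for name, members in compounds.items()],
    ]
    for lines in groups:
        if lines:  # skip empty groups entirely
            print("\n".join(lines), file=fp)
            print(file=fp)


# Made-up sample covering one associative array listed in one indexed array.
buf = io.StringIO()
emit_groups(
    unsets=["products", "product1"],
    declares=["product1"],
    assignments=["product1=(", '  [name]="Prod One"', ")"],
    compounds={"products": ["product1"]},
    fp=buf,
)
print(buf.getvalue(), end="")
```

A helper like this would also remove the duplicated printing code at the end of the two handler functions.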

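Since products -> product1/product2 has to be handled completely differently from site -> author1/author2, even though they have the same node structure, I made a separate function to handle each root-level key.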
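One possible way to wire up "one function per root-level key" is a small dispatch table. This is a sketch only: HANDLERS and dispatch are illustrative names, and the two handlers here are stubs that return a label instead of generating Bash, so that the example stays self-contained.

```python
def bash_site(k0, v0):
    """Stand-in for a real site handler; returns a label only."""
    return f"site handler for {k0} ({len(v0)} keys)"


def bash_products(k0, v0):
    """Stand-in for a real products handler; returns a label only."""
    return f"products handler for {k0} ({len(v0)} keys)"


# Hypothetical registry: root-level key -> function that knows its layout.
HANDLERS = {"site": bash_site, "products": bash_products}


def dispatch(data):
    """Call the handler registered for each root-level key, in file order."""
    results = []
    for key, value in data.items():
        handler = HANDLERS.get(key)
        if handler is None:
            raise NotImplementedError("no handler for root key: " + repr(key))
        results.append(handler(key, value))
    return results


sample = {"site": {"title": "My blog"}, "products": {"product1": {}, "product2": {}}}
print(dispatch(sample))
```

An unknown root-level key fails loudly instead of being silently skipped, which matches the NotImplementedError style used for unknown value types.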

To get this to run, you should set up a virtual environment for Python (3.7/3.6) and install the YAML library in it:

$ python -m venv /opt/util/yaml2bash
$ /opt/util/yaml2bash/bin/pip install ruamel.yaml

Then store the following program, e.g. in /opt/util/yaml2bash/bin/yaml2bash, and make it executable (chmod +x /opt/util/yaml2bash/bin/yaml2bash):

#! /opt/util/yaml2bash/bin/python

import sys
from pathlib import Path
import ruamel.yaml

if len(sys.argv) > 1:  # sys.argv[0] is the script itself, so > 0 is always true
    input = Path(sys.argv[1])
else:
    input = sys.stdin


def bash_site(k0, v0, fp):
    """this function takes a root-level key and its value (v0 a dict), constructs the 
    list of unsets and outputs based on the keys, values and type of values of v0,
    then dumps these to fp
    """
    unsets = []
    declares = []
    assignments = []
    compounds = {}
    for k1, v1 in v0.items():
        if isinstance(v1, str):
            k = k0 + '_' + k1
            unsets.append(k)
            assignments.append(f'{k}="{v1}"')
        elif isinstance(v1, dict):
            first_val = list(v1.values())[0]
            if isinstance(first_val, str):
                k = k0 + '_' + k1
                unsets.append(k)
                declares.append(k)
                assignments.append(f'{k}=(')
                for k2, v2 in v1.items():
                    q = '"' if isinstance(v2, str) else ''
                    assignments.append(f'  [{k2}]={q}{v2}{q}')
                assignments.append(')')
            elif isinstance(first_val, dict):
                for k2, v2 in v1.items(): # assume all the same type
                    k = k0 + '_' + k1 + '_' + k2   
                    unsets.append(k)
                    declares.append(k)
                    assignments.append(f'{k}=(')
                    for k3, v3 in v2.items():
                        q = '"' if isinstance(v3, str) else ''
                        assignments.append(f'  [{k3}]={q}{v3}{q}')  # key on k3 (title/url), not k2
                    assignments.append(')')
                    compounds.setdefault(k0 + '_' + k1, []).append(k)
            else:
                raise NotImplementedError("unknown val: " + repr(first_val))
        elif isinstance(v1, list):
            unsets.append(k1)
            compounds[k1] = v1
        else:
            raise NotImplementedError("unknown val: " + repr(v1))


    if unsets:
        for item in unsets:
            print('unset', item, file=fp)
        print(file=fp)
    if declares:
        for item in declares:
            print('declare -A', item, file=fp)
        print(file=fp)
    if assignments:
        for item in assignments:
            print(item, file=fp)
        print(file=fp)
    if compounds:
        for k in compounds:
            v = ' '.join(compounds[k])
            print(f'{k}=({v})', file=fp)
        print(file=fp)


def bash_products(k0, v0, fp):
    """this function takes a root-level key and its value (v0 a dict), constructs the 
    list of unsets and outputs based on the keys, values and type of values of v0,
    then dumps these to fp
    """
    unsets = [k0]
    declares = []
    assignments = []
    compounds = {}
    for k1, v1 in v0.items():
        if isinstance(v1, dict):
            first_val = list(v1.values())[0]
            if isinstance(first_val, str):
                unsets.append(k1)
                declares.append(k1)
                assignments.append(f'{k1}=(')
                for k2, v2 in v1.items():
                    q = '"' if isinstance(v2, str) else ''
                    assignments.append(f'  [{k2}]={q}{v2}{q}')
                assignments.append(')')
                compounds.setdefault(k0, []).append(k1)
            else:
                raise NotImplementedError("unknown val: " + repr(first_val))
        else:
            raise NotImplementedError("unknown val: " + repr(v1))


    if unsets:
        for item in unsets:
            print('unset', item, file=fp)
        print(file=fp)
    if declares:
        for item in declares:
            print('declare -A', item, file=fp)
        print(file=fp)
    if assignments:
        for item in assignments:
            print(item, file=fp)
        print(file=fp)
    if compounds:
        for k in compounds:
            v = ' '.join(compounds[k])
            print(f'{k}=({v})', file=fp)
        print(file=fp)




yaml = ruamel.yaml.YAML()
data = yaml.load(input)

output = sys.stdout  # make it easier to redirect to file if necessary at some point in the future

bash_site('site', data['site'], output)
bash_products('products', data['products'], output)

If you run this program and provide your YAML input file as an argument (/opt/util/yaml2bash/bin/yaml2bash input.yaml), it gives:

unset site_title
unset site_domain
unset site_author1
unset site_author2
unset site_header_links_about
unset site_header_links_contact
unset js_deps

declare -A site_author1
declare -A site_author2
declare -A site_header_links_about
declare -A site_header_links_contact

site_title="My blog"
site_domain="example.com"
site_author1=(
  [name]="bob"
  [url]="/author/bob"
)
site_author2=(
  [name]="jane"
  [url]="/author/jane"
)
site_header_links_about=(
  [title]="About"
  [url]="about.html"
)
site_header_links_contact=(
  [title]="Contact Us"
  [url]="contactus.html"
)

site_header_links=(site_header_links_about site_header_links_contact)
js_deps=(cashjs jets)

unset products
unset product1
unset product2

declare -A product1
declare -A product2

product1=(
  [name]="Prod One"
  [price]=10
)
product2=(
  [name]="Prod Two"
  [price]=20
)

products=(product1 product2)

You can use something like source <(/opt/util/yaml2bash/bin/yaml2bash input.yaml) to get all of these values into bash.
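One caveat when sourcing generated code: the program wraps string values in plain double quotes, so a value containing a double quote, a dollar sign or a backtick would be broken or expanded by bash. A minimal sketch of a safer assignment using shlex.quote from the Python standard library follows. The helper name bash_assign and the site_motto key are made up for illustration.

```python
import shlex


def bash_assign(name, value):
    """Return a Bash assignment whose right-hand side is safely quoted."""
    return f"{name}={shlex.quote(str(value))}"


print(bash_assign("site_title", "My blog"))
# A hypothetical value that plain double quotes would not survive:
print(bash_assign("site_motto", 'say "hi" for $5'))
```

shlex.quote produces single-quoted strings, so bash performs no expansion inside the value when the output is sourced.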

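Please note that all of the double quotes in your YAML file are superfluous.

Using Python and ruamel.yaml (disclaimer: I am the author of that package) gives you a full YAML parser, which for example allows you to use comments and flow-style collections: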

jsdeps: [cashjs, jets]    # more compact

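If you are stuck on the nearly end-of-life Python 2.7 and do not have full control over your machine (in which case you should install/compile Python 3.7 for it), you can still use ruamel.yaml.

1. Determine where your program is going to live, e.g. ~/bin
2. Create ~/bin/ruamel (adjust according to 1).
3. cd ~/bin/ruamel
4. touch __init__.py
5. Download the latest tar file from PyPI
6. Unpack the tar file and rename the resulting directory from ruamel.yaml-X.Y.Z to just yaml

ruamel.yaml should work without its dependencies. On 2.7 those are ruamel.ordereddict and ruamel.yaml.clib, which provide C versions of basic routines to speed things up.

The program above would need some rewriting (f-strings -> "".format() and pathlib.Path -> old-fashioned with open(...) as fp:).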
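The two rewrites mentioned above might look roughly like this. It is a sketch, not a full port: assignment and read_input are illustrative helper names, and on 2.7 the print(..., file=fp) calls in the program would additionally need from __future__ import print_function at the top of the file.

```python
import sys


def assignment(key, value):
    """str.format() replaces the f-string f'{key}="{value}"'."""
    return '{}="{}"'.format(key, value)


def read_input(argv):
    """Plain open() replaces pathlib.Path; fall back to stdin without a file."""
    if len(argv) > 1:
        with open(argv[1]) as fp:
            return fp.read()
    return sys.stdin.read()


print(assignment("site_title", "My blog"))
```

Both helpers use only constructs that exist in Python 2.7 as well as Python 3.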

Answer 2

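I decided to use a combination of:

A hacked version of Yay:
- added support for simple lists
- fixed multiple indentation levels
A hacked version of this yaml parser:
- has the prefix stuff borrowed from Yay, for consistency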


function yaml_to_vars {
   # find input file
   for f in "$1" "$1.yay" "$1.yml"
   do
     [[ -f "$f" ]] && input="$f" && break
   done
   [[ -z "$input" ]] && exit 1

   # use given dataset prefix or imply from file name
   [[ -n "$2" ]] && local prefix="$2" || {
     local prefix=$(basename "$input"); prefix=${prefix%.*}; prefix="${prefix//-/_}_";
   }

   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|,$s\]$s\$|]|" \
        -e ":1;s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s,$s\(.*\)$s\]|\1\2: [\3]\n\1  - \4|;t1" \
        -e "s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s\]|\1\2:\n\1  - \3|;p" "$input" | \
   sed -ne "s|,$s}$s\$|}|" \
        -e ":1;s|^\($s\)-$s{$s\(.*\)$s,$s\($w\)$s:$s\(.*\)$s}|\1- {\2}\n\1  \3: \4|;t1" \
        -e    "s|^\($s\)-$s{$s\(.*\)$s}|\1-\n\1  \2|;p" | \
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)-$s[\"']\(.*\)[\"']$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)-$s\(.*\)$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" | \
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]; idx[i]=0}}
      if(length($2)== 0){  vname[indent]= ++idx[indent] };
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) { vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, vname[indent], $3);
      }
   }'
}

yay_parse() {

   # find input file
   for f in "$1" "$1.yay" "$1.yml"
   do
     [[ -f "$f" ]] && input="$f" && break
   done
   [[ -z "$input" ]] && exit 1

   # use given dataset prefix or imply from file name
   [[ -n "$2" ]] && local prefix="$2" || {
     local prefix=$(basename "$input"); prefix=${prefix%.*}; prefix=${prefix//-/_};
   }

   echo "unset $prefix; declare -g -a $prefix;"

   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   #sed -n -e "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
   #       -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" "$input" |
   sed -ne "s|,$s\]$s\$|]|" \
        -e ":1;s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s,$s\(.*\)$s\]|\1\2: [\3]\n\1  - \4|;t1" \
        -e "s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s\]|\1\2:\n\1  - \3|;p" "$input" | \
   sed -ne "s|,$s}$s\$|}|" \
        -e ":1;s|^\($s\)-$s{$s\(.*\)$s,$s\($w\)$s:$s\(.*\)$s}|\1- {\2}\n\1  \3: \4|;t1" \
        -e    "s|^\($s\)-$s{$s\(.*\)$s}|\1-\n\1  \2|;p" | \
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)-$s[\"']\(.*\)[\"']$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)-$s\(.*\)$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" | \
   awk -F$fs '{
      indent       = length($1)/2;
      key          = $2;
      value        = $3;

      # No prefix or parent for the top level (indent zero)
      root_prefix  = "'$prefix'_";
      if (indent == 0) {
        prefix = "";          parent_key = "'$prefix'";
      } else {
        prefix = root_prefix; parent_key = keys[indent-1];
      }

      keys[indent] = key;

      # remove keys left behind if prior row was indented more than this row
      for (i in keys) {if (i > indent) {delete keys[i]}}

      # if we have a value
      if (length(value) > 0) {

        # set values here

        # if the "key" is missing, make array indexed, not assoc..

        if (length(key) == 0) {
          # array item has no key, only a value..
          # so, if we didnt already unset the assoc array
          if (unsetArray == 0) {
            # unset the assoc array here
            printf("unset %s%s; ", prefix, parent_key);
            # switch the flag, so we only unset once, before adding values
            unsetArray = 1;
          }
          # array was unset, has no key, so add item using indexed array syntax
          printf("%s%s+=(\"%s\");\n", prefix, parent_key, value);

        } else {
          # array item has key and value, add item using assoc array syntax
          printf("%s%s[%s]=\"%s\";\n", prefix, parent_key, key, value);
        }

      } else {

        # declare arrays here

        # reset this flag for each new array we work on...
        unsetArray = 0;

        # if item has no key, declare indexed array
        if (length(key) == 0) {
          # indexed
          printf("unset %s%s; declare -g -a %s%s;\n", root_prefix, key, root_prefix, key);

        # if item has numeric key, declare indexed array
        } else if (key ~ /^[[:digit:]]/) {
          printf("unset %s%s; declare -g -a %s%s;\n", root_prefix, key, root_prefix, key);

        # else (item has a string for a key), declare associative array
        } else {
          printf("unset %s%s; declare -g -A %s%s;\n", root_prefix, key, root_prefix, key);
        }

        # set root level values here

        if (indent > 0) {
          # add to associative array
          printf("%s%s[%s]+=\"%s%s\";\n", prefix, parent_key , key, root_prefix, key);
        } else {
          # add to indexed array
          printf("%s%s+=( \"%s%s\");\n", prefix, parent_key , root_prefix, key);
        }

      }
   }'
}

# helper to load yay data file
yay() {
  # yaml_to_vars "$@"  ## uncomment to debug (prints data to stdout)
  eval $(yaml_to_vars "$@")

  # yay_parse "$@"  ## uncomment to debug (prints data to stdout)
  eval $(yay_parse "$@")
}
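The awk stage of yaml_to_vars keeps one key name per indent level and joins them with underscores whenever a line carries a value. A rough Python rendering of that idea is shown below, purely as an illustration. It assumes two-space indentation and handles only "key: value" lines (no lists, no flow style); the function name flatten is made up here.

```python
def flatten(lines, prefix=""):
    """Turn indented 'key: value' lines into underscore-joined assignments."""
    stack = {}  # indent level -> key name currently active at that level
    out = []
    for line in lines:
        if not line.strip():
            continue
        body = line.lstrip(" ")
        level = (len(line) - len(body)) // 2  # assumes two spaces per level
        key, _, value = body.partition(":")
        key, value = key.strip(), value.strip().strip("\"'")
        # drop keys left over from a deeper, earlier branch
        for deeper in [lvl for lvl in stack if lvl > level]:
            del stack[deeper]
        stack[level] = key
        if value:
            name = "_".join(stack[lvl] for lvl in sorted(stack))
            out.append(f'{prefix}{name}="{value}"')
    return out


sample = [
    "product1:",
    "  name: Foo",
    "  price: 100",
    "product2:",
    "  name: Bar",
]
print("\n".join(flatten(sample, prefix="products_")))
```

The prefix argument plays the role of the file-name-derived prefix in yaml_to_vars, so the sample prints names such as products_product1_name.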

Using the code above, when products.yml contains the following:

product1:
  name: Foo
  price: 100
product2:
  name: Bar
  price: 200

The parser can be called like this:

source path/to/yml-parser.sh
yay products.yml

It generates and evaluates the following code:

products_product1_name="Foo"
products_product1_price="100"
products_product2_name="Bar"
products_product2_price="200"
unset products;
declare -g -a products;
unset products_product1;
declare -g -A products_product1;
products+=( "products_product1");
products_product1[name]="Foo";
products_product1[price]="100";
unset products_product2;
declare -g -A products_product2;
products+=( "products_product2");
products_product2[name]="Bar";
products_product2[price]="200";

So, I get the following Bash arrays and variables:

declare -a products=([0]="products_product1" [1]="products_product2")
declare -A products_product1=([price]="100" [name]="Foo" )
declare -A products_product2=([price]="200" [name]="Bar" )

In my templating system, I can now access the yml data like this:

{{#foreach product in products}}
  Name:  {{product.name}}
  Price: {{product.price}}
{{/foreach}}

:)

Another example:

The file site.yml

meta_info:
  title: My cool blog
  domain: foo.github.io
author1:
  name: bob
  url: /author/bob
author2:
  name: jane
  url: /author/jane
header_links:
  link1:
    title: About
    url: about.html
  link2:
    title: Contact Us
    url: contactus.html
js_deps:
  cashjs: cashjs
  jets: jets
Foo:
  - one
  - two
  - three

produces:

declare -a site=([0]="site_meta_info" [1]="site_author1" [2]="site_author2" [3]="site_header_links" [4]="site_js_deps" [5]="site_Foo")
declare -A site_meta_info=([title]="My cool blog" [domain]="foo.github.io" )
declare -A site_author1=([url]="/author/bob" [name]="bob" )
declare -A site_author2=([url]="/author/jane" [name]="jane" )
declare -A site_header_links=([link1]="site_link1" [link2]="site_link2" )
declare -A site_link1=([url]="about.html" [title]="About" )
declare -A site_link2=([url]="contactus.html" [title]="Contact Us" )
declare -A site_js_deps=([cashjs]="cashjs" [jets]="jets" )
declare -a site_Foo=([0]="one" [1]="two" [2]="three")

In my templates, I can access site_header_links like this:

{{#foreach link in site_header_links}}
  * {{link.title}} - {{link.url}}
{{/foreach}}

The dash-style (simple) list under Foo ends up in the indexed array site_Foo, as the last declare line above shows.
