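Preserving key order in YAML from a Python script

Time: 2017-07-22 10:46:57

Tags: python yaml

I'm writing a Python script to automate my YAML work. It will build a YAML structure from several CSV files. For now, though, I'm trying to understand the YAML structure through examples. I've gone through some YAML tutorials and examples and ran into a problem I can't solve.

My Python code for the structure is as follows: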

import sys
import yaml
from collections import OrderedDict

d = {'version': '22-07-2017', 'description': 'energie balance',
     'info': {
         'principalInvestigator': 'Kalthoff',
         'personInCharge': 'Scheer'
     },
     'dataSources': 'null',
     'devices': {
       'type': 'HMP',
       'info': {
           'description': 'temperature and humidity sensor',
           'company': 'Vaisala',
           'model': 'HMP35A',
           },
       'measures': {
           'quantity': 'T',
           'annotations': 'air',
           'sensors': {
               'number': '001',
               'sources': {
                   'id': 'null',
                   'frequency': '0.1',
                   'aggregation': 'AVG',
                   'field': 'null'
                   }
               }

           }
       }
     }
with open('/home/ali/Desktop/yaml-conf-task/result.yml', 'w') as yaml_file:
    yaml.dump(d, yaml_file, default_flow_style=False)

But when I open the YAML file, the data is out of order. This is what I get:

dataSources: 'null'
description: energie balance
devices:
  info:
    company: Vaisala
    description: temperature and humidity sensor
    model: HMP35A
  measures:
    annotations: air
    quantity: T
    sensors:
      number: '001'
      sources:
        aggregation: AVG
        field: 'null'
        frequency: '0.1'
        id: 'null'
  type: HMP
info:
  personInCharge: Scheer
  principalInvestigator: Kalthoff
version: 22-07-2017

instead of this:

version: 21-07-2017
description: energie balance
info:
  principalInvestigator: rob
  personInCharge: rio
dataSources: null
devices:
  - type: TMP
    info:
      description: temperature and humidity sensor
      company: Vio
      model: 35A
    measures:
      - quantity: T
        annotation: air
        sensors:
          - number: 001
            sources:
              - id: null
                frequency: 1
                aggregation: AVG
                field: null

I'd appreciate any suggestions on how to preserve the order. I've searched Stack Overflow but couldn't solve my problem.

1 Answer:

Answer 0 (score: 1)

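First, YAML is technically a superset of JSON, and per the specification the order of keys in a mapping is not guaranteed. So what you're trying to achieve isn't something you can reproduce everywhere, and unless you control the whole data flow you may run into problems.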
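To make the point about mapping order concrete, here is a minimal sketch, assuming PyYAML is installed. The two YAML snippets are made up for illustration; they differ only in key order, and a consumer that loads them sees the same mapping:

```python
import yaml

# Two documents with the same keys in a different order (illustrative data).
first = yaml.safe_load("version: 1\ndescription: demo\n")
second = yaml.safe_load("description: demo\nversion: 1\n")

# Both load as plain dicts, and dict equality ignores key order.
print(first == second)  # True
```

So any tool further down the line is free to treat the two files as identical, which is why the order is only reliable if you control every step.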

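Also, as I said in the comments, Python's own dict does not generally preserve order (the language only guarantees insertion order from Python 3.7 on), but Python has collections.OrderedDict, and you can redeclare your structure so that it keeps its order: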

from collections import OrderedDict

d = OrderedDict([('version', '22-07-2017'), ('description', 'energie balance'),
                 ('info', OrderedDict([
                     ('principalInvestigator', 'Kalthoff'),
                     ('personInCharge', 'Scheer')
                 ])),
                 ('dataSources', 'null'),
                 ('devices', OrderedDict([
                     ('type', 'HMP'),
                     ('info', OrderedDict([
                         ('description', 'temperature and humidity sensor'),
                         ('company', 'Vaisala'),
                         ('model', 'HMP35A')
                     ])),
                     ('measures', OrderedDict([
                         ('quantity', 'T'),
                         ('annotations', 'air'),
                         ('sensors', OrderedDict([
                             ('number', '001'),
                             ('sources', OrderedDict([
                                 ('id', 'null'),
                                 ('frequency', '0.1'),
                                 ('aggregation', 'AVG'),
                                 ('field', 'null')
                             ]))
                         ]))
                     ]))
                 ]))
                 ])

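Yes, it's uglier than a clean dict literal, because you have to use nested lists/tuples to preserve the order, but it's not that hard once you get used to it: just replace every dict declaration with OrderedDict([...]) and every key: value pair with (key, value).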
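As a small illustration of that rewrite rule, here is one fragment of the structure converted by hand. This is only a sketch; the variable names are mine, and adding the nested part by assignment is an alternative I'm suggesting, not something the rule requires:

```python
from collections import OrderedDict

# dict literal form was: {'quantity': 'T', 'annotations': 'air'}
measures = OrderedDict([
    ('quantity', 'T'),
    ('annotations', 'air'),
])

# Keys can also be added one at a time; insertion order is kept,
# which avoids some of the nested list/tuple noise.
measures['sensors'] = OrderedDict([('number', '001')])

keys = list(measures.keys())
print(keys)  # ['quantity', 'annotations', 'sensors']
```

Building the structure incrementally like this may be more convenient if the values come from CSV rows rather than from a hand-written literal.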

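But that's only part of the equation. Once you have a dict-like structure that keeps its order, your YAML serializer has to be aware of it too. If you simply dump the structure above through a generic YAML serializer (PyYAML, say), you get: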

!!python/object/apply:collections.OrderedDict
- - [version, 22-07-2017]
  - [description, energie balance]
  - - info
    - !!python/object/apply:collections.OrderedDict
      - - [principalInvestigator, Kalthoff]
        - [personInCharge, Scheer]
  - [dataSources, 'null']
  - - devices
    - !!python/object/apply:collections.OrderedDict
      - - [type, HMP]
        - - info
          - !!python/object/apply:collections.OrderedDict
            - - [description, temperature and humidity sensor]
              - [company, Vaisala]
              - [model, HMP35A]
        - - measures
          - !!python/object/apply:collections.OrderedDict
            - - [quantity, T]
              - [annotations, air]
              - - sensors
                - !!python/object/apply:collections.OrderedDict
                  - - [number, '001']
                    - - sources
                      - !!python/object/apply:collections.OrderedDict
                        - - [id, 'null']
                          - [frequency, '0.1']
                          - [aggregation, AVG]
                          - [field, 'null']

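Sure, it preserves the order, but it exports the actual internal collections.OrderedDict structure, which lets you load it back into the same structure, and that's not what you want. Instead, you need to tell it to treat your OrderedDict as a regular mapping, like this: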

import yaml

def ordered_dict_representer(self, value):  # can be a lambda if that's what you prefer
    return self.represent_mapping('tag:yaml.org,2002:map', value.items())
yaml.add_representer(OrderedDict, ordered_dict_representer)

Now if you export it with:

with open('/home/ali/Desktop/yaml-conf-task/result.yml', 'w') as yaml_file:
    yaml.dump(d, yaml_file,  default_flow_style=False)

you'll get:

version: 22-07-2017
description: energie balance
info:
  principalInvestigator: Kalthoff
  personInCharge: Scheer
dataSources: 'null'
devices:
  type: HMP
  info:
    description: temperature and humidity sensor
    company: Vaisala
    model: HMP35A
  measures:
    quantity: T
    annotations: air
    sensors:
      number: '001'
      sources:
        id: 'null'
        frequency: '0.1'
        aggregation: AVG
        field: 'null'
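The output above still differs from the target in the question in two ways: the target uses sequences (the `- type: ...` entries) and a bare `null` rather than the string 'null'. The following is a rough sketch of how that might be approached, assuming PyYAML; `OrderedDumper` and `represent_ordered_dict` are names I made up, and the data is a trimmed-down version of the question's structure. It registers the same representer on a `SafeDumper` subclass so the global dumper is left untouched, wraps the repeated items in Python lists, and uses `None` for null:

```python
from collections import OrderedDict
import yaml


class OrderedDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass; keeps the representer out of global state."""


def represent_ordered_dict(dumper, value):
    # Passing items() (not the dict itself) keeps PyYAML from sorting the keys.
    return dumper.represent_mapping('tag:yaml.org,2002:map', value.items())


OrderedDumper.add_representer(OrderedDict, represent_ordered_dict)

config = OrderedDict([
    ('version', '22-07-2017'),
    ('dataSources', None),           # None is emitted as YAML null
    ('devices', [                    # a Python list becomes a YAML sequence
        OrderedDict([
            ('type', 'HMP'),
            ('measures', [
                OrderedDict([('quantity', 'T'), ('annotations', 'air')]),
            ]),
        ]),
    ]),
])

text = yaml.dump(config, Dumper=OrderedDumper, default_flow_style=False)
print(text)
```

The exact indentation of the sequence items may not match the target file character for character, but it loads to the same data. On newer setups (PyYAML 5.1 or later on Python 3.7 or later), passing `sort_keys=False` to `yaml.dump` with plain dicts should also keep insertion order without a custom representer.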