Ignore dates and times while parsing YAML?

Asked: 2016-01-07 23:33:29

Tags: python yaml

I'm writing a script that converts a series of YAML files into a single JSON blob. I have a YAML file like this:

---
AWSTemplateFormatVersion: 2010-09-09
Description: AWS CloudFormation ECS Sample
Parameters:
    - SolrCloudInstanceType:
        Type: String
        Description: Solr Cloud EC2 Instance Type
        Default: m3.2xlarge
Resources:
    - ContainerInstance:
        Type: AWS::EC2::Instance
        Properties:
            InstanceType: m3.xlarge

I'm loading it like this:

import yaml

with open('base.yml', 'rb') as f:
    result = yaml.safe_load(f)

Interestingly, if I inspect AWSTemplateFormatVersion, I get a Python datetime.date object. This causes my JSON output to fail:

>>> json.dump(result, sys.stdout, sort_keys=True, indent=4)
{
    "AWSTemplateFormatVersion": Traceback (most recent call last):
  File "./c12n-assemble", line 42, in <module>
    __main__()
  File "./c12n-assemble", line 25, in __main__
    assembler.assemble()
  File "./c12n-assemble", line 39, in assemble
    json.dump(self.__result, self.__output_file, sort_keys=True, indent=4, separators=(',', ': '))
  File "/usr/lib/python2.7/json/__init__.py", line 189, in dump
    for chunk in iterable:
  File "/usr/lib/python2.7/json/encoder.py", line 434, in _iterencode
    for chunk in _iterencode_dict(o, _current_indent_level):
  File "/usr/lib/python2.7/json/encoder.py", line 408, in _iterencode_dict
    for chunk in chunks:
  File "/usr/lib/python2.7/json/encoder.py", line 442, in _iterencode
    o = _default(o)
  File "/usr/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: datetime.date(2010, 9, 9) is not JSON serializable

Is there a way to force the YAML parser not to be "smart" about what it thinks is a date or a date+time, and just parse it as a string?

2 answers:

Answer 0 (score: 5)

You can extend the PyYAML loader and remove the implicit tagging of timestamps, or of other types, as follows:

class NoDatesSafeLoader(yaml.SafeLoader):
    @classmethod
    def remove_implicit_resolver(cls, tag_to_remove):
        """
        Remove implicit resolvers for a particular tag

        Takes care not to modify resolvers in super classes.

        We want to load datetimes as strings, not dates, because we
        go on to serialise as json which doesn't have the advanced types
        of yaml, and leads to incompatibilities down the track.
        """
        if 'yaml_implicit_resolvers' not in cls.__dict__:
            cls.yaml_implicit_resolvers = cls.yaml_implicit_resolvers.copy()

        for first_letter, mappings in cls.yaml_implicit_resolvers.items():
            cls.yaml_implicit_resolvers[first_letter] = [
                (tag, regexp)
                for tag, regexp in mappings
                if tag != tag_to_remove
            ]

NoDatesSafeLoader.remove_implicit_resolver('tag:yaml.org,2002:timestamp')

Use this alternative loader as follows:

>>> yaml.load("2015-03-22 01:49:21", Loader=NoDatesSafeLoader)
'2015-03-22 01:49:21'

For reference, the original behavior is:

>>> yaml.safe_load("2015-03-22 01:49:21")
datetime.datetime(2015, 3, 22, 1, 49, 21)
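
To tie this back to the question, here is a minimal sketch of the full YAML-to-JSON round trip. It builds the resolver table with a dict comprehension instead of the classmethod above, purely for brevity, and TEMPLATE is a made-up, abbreviated stand-in for base.yml:

```python
import json

import yaml


class NoDatesSafeLoader(yaml.SafeLoader):
    """SafeLoader variant that leaves timestamp-like scalars as strings."""


# Give the subclass its own resolver table with the timestamp entries
# dropped, so that yaml.SafeLoader itself is left unmodified.
NoDatesSafeLoader.yaml_implicit_resolvers = {
    first_char: [(tag, regexp) for tag, regexp in resolvers
                 if tag != 'tag:yaml.org,2002:timestamp']
    for first_char, resolvers in yaml.SafeLoader.yaml_implicit_resolvers.items()
}

# Abbreviated stand-in for base.yml (an assumption for illustration).
TEMPLATE = """\
AWSTemplateFormatVersion: 2010-09-09
Description: AWS CloudFormation ECS Sample
"""

result = yaml.load(TEMPLATE, Loader=NoDatesSafeLoader)

# The version is now a plain string, so json can serialise it.
json_text = json.dumps(result, sort_keys=True, indent=4)
print(json_text)
```

With the timestamp resolver gone, AWSTemplateFormatVersion should come back as the string '2010-09-09', and json.dumps no longer hits the TypeError from the question.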

Answer 1 (score: 1)

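The accepted answer's approach works well for libraries built on PyYAML. In fact, it arguably belongs in PyYAML's BaseResolver class itself. For a quicker, kludgier way to remove a specific resolver, though: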

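# Subclass rather than alias: "NoDatesSafeLoader = yaml.SafeLoader" would
# change yaml.SafeLoader itself, and with it yaml.safe_load().
class NoDatesSafeLoader(yaml.SafeLoader):
    pass

NoDatesSafeLoader.yaml_implicit_resolvers = {
    k: [r for r in v if r[0] != 'tag:yaml.org,2002:timestamp']
    for k, v in yaml.SafeLoader.yaml_implicit_resolvers.items()
}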

Then:

>>> yaml.load("2015-03-22 01:49:21", Loader=NoDatesSafeLoader)
'2015-03-22 01:49:21'
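
One thing worth checking with either approach is that only the custom loader changes, since yaml.safe_load uses yaml.SafeLoader internally. A minimal sketch of such a check, assuming the loader is defined as a subclass rather than as another name for yaml.SafeLoader (the sample document and its keys are made up for illustration):

```python
import datetime

import yaml


class NoDatesSafeLoader(yaml.SafeLoader):
    pass


# The new table is assigned on the subclass only; the lists inside
# yaml.SafeLoader's own table are never touched.
NoDatesSafeLoader.yaml_implicit_resolvers = {
    k: [r for r in v if r[0] != 'tag:yaml.org,2002:timestamp']
    for k, v in yaml.SafeLoader.yaml_implicit_resolvers.items()
}

# Hypothetical sample document mixing a timestamp with other scalar types.
doc = "when: 2015-03-22 01:49:21\ncount: 3\nenabled: true\n"

plain = yaml.load(doc, Loader=NoDatesSafeLoader)  # timestamps stay strings
stock = yaml.safe_load(doc)                       # stock behaviour

print(plain)
print(stock)
```

Integers and booleans should still resolve as usual under the custom loader, while the stock loader keeps producing datetime objects.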