It looks like PyYAML interprets the string 10:01 as a duration in seconds:
>>> import yaml
>>> yaml.load("time: 10:01")
{'time': 601}
The official documentation does not mention this: PyYAML documentation
Any suggestions on how to read 10:01 as a string?
Answer 0 (score: 4)
Put it in quotes:
>>> import yaml
>>> yaml.load('time: "10:01"')
{'time': '10:01'}
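This tells YAML that the value is a literal string and stops it from trying to treat it as a number.
Answer 1 (score: 1)
Since you are using a parser for YAML 1.1, you should expect it to implement what the specification indicates (example 2.19):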
sexagesimal: 3:25:45
Sexagesimals are explained further here:
Using ":" allows expressing integers in base 60, which is convenient for time and angle values.
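To make the base-60 reading concrete, here is a minimal sketch of the arithmetic. The helper name `sexagesimal_to_int` is made up for this illustration and is not part of PyYAML; the library's own constructor also deals with signs and underscores, which this sketch ignores.

```python
def sexagesimal_to_int(text):
    """Interpret colon-separated digit groups as a base-60 integer (illustrative only)."""
    value = 0
    for part in text.split(":"):
        # Each new group shifts the running total one base-60 place to the left.
        value = value * 60 + int(part)
    return value


print(sexagesimal_to_int("10:01"))    # 10 * 60 + 1 = 601
print(sexagesimal_to_int("3:25:45"))  # 3 * 3600 + 25 * 60 + 45 = 12345
```

This is why `time: 10:01` comes back as 601: the value is read as "10 sixties plus 1", not as a clock time.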
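Not every detail implemented in PyYAML is covered in the documentation you refer to; you should treat it only as an introduction.
You are not the only one who found this interpretation confusing, and in YAML 1.2 sexagesimals were dropped from the specification. Although that specification has been out for about eight years, the changes have never been implemented in PyYAML.
The easiest way to solve this is to upgrade to ruamel.yaml (disclaimer: I am the author of that package). It gives you the YAML 1.2 behaviour (unless you explicitly specify that you want YAML 1.1), which interprets 10:01 as a string: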
from ruamel import yaml
import warnings
warnings.simplefilter('ignore', yaml.error.UnsafeLoaderWarning)
data = yaml.load("time: 10:01")
print(data)
which gives:
{'time': '10:01'}
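The warnings.simplefilter call is only needed because you are using .load() instead of .safe_load(). The former is unsafe and can lead to a wiped disk, or worse, when used on uncontrolled YAML input. There is seldom a reason not to use .safe_load().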
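Answer 2 (score: 0)
If you want to monkeypatch the PyYAML library so that it does not have this behaviour (there is no neat way to do this), the code below works for a resolver class of your choice. The problem is that the regex that is used for int includes some code to match timestamps, even though there appears to be no spec for this behaviour; it was simply deemed "good practice" to treat strings such as 30:00 or 40:11:11:11:11 as integers.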
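The branch in question can be exercised on its own with the standard `re` module. This is only a sketch: it copies the base-60 alternative that the monkeypatch below strips out and anchors it with `fullmatch`. The names `SEXAGESIMAL_INT` and `looks_like_sexagesimal` are invented for the illustration, and the full int pattern in PyYAML contains other alternatives that are not reproduced here.

```python
import re

# Assumed copy of just the base-60 alternative of the int pattern;
# the decimal, octal, hex and binary alternatives are left out on purpose.
SEXAGESIMAL_INT = re.compile(r'[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+')


def looks_like_sexagesimal(scalar):
    """Return True if the whole scalar matches the base-60 int alternative."""
    return SEXAGESIMAL_INT.fullmatch(scalar) is not None


for scalar in ["30:00", "40:11:11:11", "10:01", "30z", "10:61"]:
    print(scalar, looks_like_sexagesimal(scalar))
```

The first three scalars match; `30z` does not, and neither does `10:61`, because every group after the first colon is limited to the range 0 to 59.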
import re

import yaml


def partition_list(somelist, predicate):
    """Split somelist into (items matching predicate, items not matching)."""
    truelist = []
    falselist = []
    for item in somelist:
        if predicate(item):
            truelist.append(item)
        else:
            falselist.append(item)
    return truelist, falselist


@classmethod
def init_implicit_resolvers(cls):
    """
    creates own copy of yaml_implicit_resolvers from superclass
    code taken from add_implicit_resolvers; this should be refactored elsewhere
    """
    if 'yaml_implicit_resolvers' not in cls.__dict__:
        implicit_resolvers = {}
        for key in cls.yaml_implicit_resolvers:
            implicit_resolvers[key] = cls.yaml_implicit_resolvers[key][:]
        cls.yaml_implicit_resolvers = implicit_resolvers


@classmethod
def remove_implicit_resolver(cls, tag, verbose=False):
    # Work on a private copy so the parent class keeps its resolvers.
    cls.init_implicit_resolvers()
    removed = {}
    for key in cls.yaml_implicit_resolvers:
        v = cls.yaml_implicit_resolvers[key]
        vremoved, v2 = partition_list(v, lambda x: x[0] == tag)
        if vremoved:
            cls.yaml_implicit_resolvers[key] = v2
            removed[key] = vremoved
    return removed


@classmethod
def _monkeypatch_fix_int_no_timestamp(cls):
    # The base-60 alternative of the int regex that we want to drop.
    bad = '|[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+'
    for key in cls.yaml_implicit_resolvers:
        v = cls.yaml_implicit_resolvers[key]
        vcopy = v[:]
        n = 0
        for k in range(len(v)):
            if v[k][0] == 'tag:yaml.org,2002:int' and bad in v[k][1].pattern:
                n += 1
                p = v[k][1]
                # Recompile the int pattern without the base-60 alternative.
                p2 = re.compile(p.pattern.replace(bad, ''), p.flags)
                vcopy[k] = (v[k][0], p2)
        if n > 0:
            cls.yaml_implicit_resolvers[key] = vcopy


# Attach the helpers to the base Resolver so that subclasses inherit them.
yaml.resolver.Resolver.init_implicit_resolvers = init_implicit_resolvers
yaml.resolver.Resolver.remove_implicit_resolver = remove_implicit_resolver
yaml.resolver.Resolver._monkeypatch_fix_int_no_timestamp = _monkeypatch_fix_int_no_timestamp
Then, if you do this:
class MyResolver(yaml.resolver.Resolver):
    pass


# Patch only the private resolver subclass, not PyYAML's default one.
t1 = MyResolver.remove_implicit_resolver('tag:yaml.org,2002:timestamp')
MyResolver._monkeypatch_fix_int_no_timestamp()


class MyLoader(yaml.SafeLoader, MyResolver):
    pass


text = '''
a: 3
b: 30:00
c: 30z
d: 40:11:11:11
'''
print(yaml.safe_load(text))
print(yaml.load(text, Loader=MyLoader))
it prints (key order as on Python 3.7+, where dicts keep insertion order):
{'a': 3, 'b': 1800, 'c': '30z', 'd': 8680271}
{'a': 3, 'b': '30:00', 'c': '30z', 'd': '40:11:11:11'}
This shows that the default yaml behaviour is left untouched, while your private loader class handles these strings sanely.