yaml.dump adds unwanted newlines in multi-line strings

Asked: 2017-07-10 05:40:46

Tags: python serialization pyyaml

I have a multi-line string:

>>> import credstash
>>> d = credstash.getSecret('alex_test_key', region='ap-southeast-2')

To look at the raw data (first 162 characters):

>>> credstash.getSecret('alex_test_key', region='ap-southeast-2')[0:162]
u'-----BEGIN RSA PRIVATE KEY-----\nMIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx\nxk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45\n'

>>> print d[0:162]
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45

I write it to a YAML file:

>>> import yaml
>>> with open('foo.yaml', 'w') as f:
...     yaml.dump(d, f, default_flow_style=False, explicit_start=True)
... 

The file now looks like this:

$ head -5 foo.yaml 
--- !!python/unicode '-----BEGIN RSA PRIVATE KEY-----

  MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx

  xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45

That is, every line is followed by two newlines.

If I read the file back into a string, the round trip turns out to be fine:

>>> with open('foo.yaml', 'r') as f:
...     d = yaml.load(f)
... 
>>> print d[0:162]
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45

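(But I don't understand why.)

My real problem is that people who read this YAML file may assume, as I did, that my program has broken the format of the private key file.

Is there a way to make yaml.dump write the string without the extra newlines?

4 answers:

Answer 0 (score: 6)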

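If this is the only thing going into the YAML file, you can dump with the option default_style='|', which gives you block style literals for all of your scalars (probably not what you want).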
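To see why a global default_style='|' is probably not what you want, here is a minimal sketch that dumps a small mapping that way. The sample data is made up for illustration; only yaml.safe_dump, yaml.safe_load and the default_style parameter come from PyYAML.

```python
import yaml

# Hypothetical sample data, not taken from the question.
data = {"a": 1, "b": "hello world", "c": "line one\nline two\n"}

# default_style='|' requests the literal block style for every scalar,
# so the short string and even the integer are written as blocks too.
dumped = yaml.safe_dump(data, default_style="|")
print(dumped)

# Loading the result gives back the same data.
reloaded = yaml.safe_load(dumped)
print(reloaded == data)
```

The output is valid YAML, but the single-line values become noisy, which is the reason for the more selective approaches further down.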

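Your string contains no special characters (which would need \ escaping and double quotes), so because of the newlines PyYAML decides to represent it single-quoted. In single-quoted style, a double newline is the way to represent a single newline that occurs in the string being represented. This is "undone" on loading, but it is indeed not very readable.

If you want block style literals on a case-by-case basis, you can do several things:

  • Adapt the Representer so that all strings with embedded newlines are output in the literal block scalar style (assuming they contain no special characters that need \ escaping, which would force double quotes)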
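The folding rule can be checked directly. The following is a small sketch with made-up strings (not the key from the question) showing that a blank line inside a single-quoted scalar loads as one newline, while a lone line break loads as a space:

```python
import yaml

# A single-quoted scalar spread over several lines. The blank lines are
# what encode the newlines of the original string.
doc = "key: 'first\n\n  second\n\n  third'\n"
loaded = yaml.safe_load(doc)
print(repr(loaded["key"]))

# Without the blank line, the line break is folded into a single space.
folded = yaml.safe_load("key: 'first\n  second'\n")
print(repr(folded["key"]))

# Dumping a multi-line string and loading it again returns the original.
text = "first\nsecond\nthird\n"
round_tripped = yaml.safe_load(yaml.safe_dump(text))
print(round_tripped == text)
```

This is why the file in the question looks double-spaced and still loads back unchanged.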

    import sys
    import yaml
    
    x = u"""\
    -----BEGIN RSA PRIVATE KEY-----
    MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
    xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
    ...
    """
    
    yaml.SafeDumper.org_represent_str = yaml.SafeDumper.represent_str
    
    def repr_str(dumper, data):
        if '\n' in data:
            return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='|')
        return dumper.org_represent_str(data)
    
    yaml.add_representer(str, repr_str, Dumper=yaml.SafeDumper)
    
    yaml.safe_dump(dict(a=1, b='hello world', c=x), sys.stdout)
    
  • Create a subclass of str that has its own special representer:

    import sys
    import yaml
    
    class PSS(str):
        pass
    
    x = PSS("""\
    -----BEGIN RSA PRIVATE KEY-----
    MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
    xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
    ...
    """)
    
    def pss_representer(dumper, data):
        style = '|'
        # if sys.version_info < (3,) and not isinstance(data, unicode):
        #     data = unicode(data, 'ascii')
        tag = u'tag:yaml.org,2002:str'
        return dumper.represent_scalar(tag, data, style=style)
    
    yaml.add_representer(PSS, pss_representer, Dumper=yaml.SafeDumper)
    
    yaml.safe_dump(dict(a=1, b='hello world', c=x), sys.stdout)
    
  • Use ruamel.yaml:

    import sys
    from ruamel.yaml import YAML
    from ruamel.yaml.scalarstring import PreservedScalarString as pss
    
    x = pss("""\
    -----BEGIN RSA PRIVATE KEY-----
    MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
    xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
    ...
    """)
    
    yaml = YAML()
    
    yaml.dump(dict(a=1, b='hello world', c=x), sys.stdout)
    

All of these give:

a: 1
b: hello world
c: |
  -----BEGIN RSA PRIVATE KEY-----
  MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
  xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
  ...

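Note that there is no need to specify default_flow_style=False, because literal scalars can only appear in block style.

Answer 1 (score: 3)

Based on Anthon's answer, I found documentation for some of the other options that can be passed to default_style.

The best compromise I could find for representing all of my data was: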
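The effect of that note can be seen with a small sketch. It assumes a hypothetical str subclass LiteralStr and a separate dumper class LiteralDumper (both names are invented here, in the spirit of the PSS example above), and passes default_flow_style=None so that PyYAML is free to choose flow style for simple collections:

```python
import yaml


class LiteralStr(str):
    """Hypothetical marker type for strings to be dumped with '|'."""


class LiteralDumper(yaml.SafeDumper):
    """Separate dumper so the global SafeDumper is left untouched."""


def represent_literal(dumper, data):
    # Emit the string with the standard str tag, in literal block style.
    return dumper.represent_scalar("tag:yaml.org,2002:str", str(data), style="|")


LiteralDumper.add_representer(LiteralStr, represent_literal)

plain = {"a": 1, "b": "hello world"}
with_literal = dict(plain, c=LiteralStr("line one\nline two\n"))

# With only plain scalars, the mapping may be written in flow style.
flow_out = yaml.dump(plain, Dumper=LiteralDumper, default_flow_style=None)
# A literal scalar inside the mapping makes the dumper use block style.
block_out = yaml.dump(with_literal, Dumper=LiteralDumper, default_flow_style=None)
print(flow_out)
print(block_out)
```

Registering the representer on a subclass of SafeDumper is a design choice of this sketch; it avoids changing how every other safe_dump call in the program behaves.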

with open('foo.yaml', 'w') as f:
    yaml.safe_dump(secrets, f, explicit_start=True, default_style='"', width=4096)

This results in a YAML file that looks like:

---
"alex_test": "yyyyyyyy"
"alex_test_key": "-----BEGIN RSA PRIVATE KEY-----\nMIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx\nxk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45\ni6LLqO6aixyhvabSy7r1bP8QBrHWUIEZRerrw0TlhuKHDoFpmRAjAAIZ/5q9PSxg\n1yCwTVMlMvBiRksPsKi0fcA/v8G+yqFBL7IeaNCPSoa/3ZdgPWbh9P69DyOlB97a\nh1+0Jmh1gtAhyiz1/hmiN7LAclKHyOOnTEEyIMJioZqJURshKdF85RKILgw2X8Lp\n78mO5VyvvGxo3BNjVr0BOrSJ3t17ugijROx3HwIBIwKCAQAUGq1uvLxGnUErgvQg\ncbk/3kcVJutAJNVXM45eNd05ygpg30JwFUWJMMwBnMch8rjz+NtMvDTpcMT2oRDM\nYn9K4u/VxfXj55kLRsuhgesYJ1vFfu79VxjFVkfCx/CbOi9TSooQqCXx8fxtTOTo\nvF1Z4VWAlxLj/HbD+hGg6jy+Iwq/8HWsHN/VFPqhNqdKvzXGOtyynSZBOUf7upMX\nPh4REE4hYMZwdDnl+NRNmm8XA9TOE+Uf8WLDooKcXjp70CES0ehiC+VD0wG5JEVQ\nbZmDTdBxPcQsO31sNwRwUIX0J4K4Z9npa3dJdRqXJuof48RLzSGwM42eJzmTRNSw\n6I77AoGBAP9LO2A2ZAD7LJBKe48GE8wzkgaQd9vc3RImwrMAXPMVP9wdKR4m/X73\ngWxQ1QbueTtBRaNwkF8l9+Iham3H3kAbBONsbvJIO9Co0n1k+S9mutO1ZWfTMWZp\nIfMz2lncLonxXCXnDndzXtTjcqHeZFmSmDZZZugPXYWtC5N2ic3pAoGBAOsypk5z\na9FG3H46TIjYKyV0Z/R0Hvrp8w+AXdogKyHh0nj9Sevr+JMgOR4ayqYUKGG3sRtM\nzyoWCJ+Wb7Rd0olc2SeouQYSzk2wFKvnnq5o0Q8YZIYkiQN82FXoN2jcELdcVdW6\n1VJuUk9K3nDe+Gz6dkHZnthFC6usL15pHs/HAoGBAIqWjfJmq1D9YVWkxrtbEg/E\nOVQFSGFpRM9W3rjxkYtGDLlRqJtW/qQCs/j4rihVkkS9CIvspiUF+5gDgusjW2Sg\n9AZuEFejji9xltZbYrNVBlWrnXLgXKVPA80qxv2UyM6KVpg7miOWZq4VElffIIhl\nhdRcaxBC2v9skUFsPC3zAoGBAN3CCoR7dEj5q1Jxe1x0C2x1EY62oN3yhhXuD1ih\n/MgssIC0TQMDDvEeYb1Mde0LsQutMfUrKbn3hHk2EYzNfVzxJIR6gpCypUHvKW7h\nst71HOJY087vP1sPT6F0jAPILQSnhCFJwdFgtAGeXLOQZpKjAckPA3t0TNUQD2ek\n8SpNAoGAfQrNfepCTbc/9BCv/sJLLMEdlB/PyzenucBeXKfsfSU6+hYM14+gLp7+\nmOgoaM7F4UkqzJTRDQJnYo1NowRHjs0xHJoQoXzlV43ZkCmTwKtZ/9APLi060Md1\n+fDJX+yvxnZsY5hw6cYwC3C/axS9jq63oQ7i8FXwG/a0breCGu8=\n-----END RSA PRIVATE KEY-----"

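I would rather use ruamel.yaml, but this code has to run in an environment where I can only use the default Python packages available to me.

Answer 2 (score: 0)

A better ruamel.yaml option that outputs multi-line strings as blocks: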
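For anyone who wants to try this compromise without real secrets, here is a minimal sketch. The secrets mapping below is a made-up stand-in with fake key material; the safe_dump arguments are the ones used above.

```python
import yaml

# Hypothetical stand-in for the real secrets mapping; the key body is fake.
secrets = {
    "alex_test": "yyyyyyyy",
    "alex_test_key": "-----BEGIN RSA PRIVATE KEY-----\n"
    + "A" * 64 + "\n"
    + "B" * 64 + "\n"
    + "-----END RSA PRIVATE KEY-----",
}

# Double quotes escape each newline as \n, and the large width keeps
# every value on one physical line.
dumped = yaml.safe_dump(secrets, explicit_start=True, default_style='"', width=4096)
print(dumped)

lines = dumped.splitlines()
reloaded = yaml.safe_load(dumped)
print(reloaded == secrets)
```

The trade-off is that the key is no longer laid out like a PEM file, but nobody will mistake escaped \n sequences for corrupted line breaks.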

from ruamel.yaml.representer import RoundTripRepresenter
from ruamel.yaml import YAML

multiline_string = """\
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
...
"""


def repr_str(dumper: RoundTripRepresenter, data: str):
    if '\n' in data:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)


yaml = YAML()
yaml.representer.add_representer(str, repr_str)

with open('file.yaml', 'w') as fp:
    yaml.dump({'a': 1, 'b': 'hello world', 'c': multiline_string}, fp)

Output:

a: 1
b: hello world
c: |
  -----BEGIN RSA PRIVATE KEY-----
  MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
  xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
  ...

Answer 3 (score: 0)

Assuming you have a file named "privateKey.pem" that contains your key, you can put its contents into a YAML multi-line block with "|" and a 2-space indent:

# Convert To yaml multiline block
cat privateKey.pem | sed -e '1 i PRIVATE_KEY: |' -e 's#^#  #g' > parameters.yaml

cat <<_EOF_ >>parameters.yaml
SSH_HOST: 'host.name.or.ip'
SSH_USER: 'cloud-user'
_EOF_

You will get something like:

PRIVATE_KEY: |
  -----BEGIN OPENSSH PRIVATE KEY-----
  b3BlbnNzaC1rZXkt....
  ...
SSH_HOST: 'host.name.or.ip'

Then, in your shell script, read the values from the parameters.yaml file:

# Read values from YAML using PyYAML
# (safe_load is used because yaml.load() requires an explicit Loader in PyYAML 6)
FILE=parameters.yaml

KEY=PRIVATE_KEY
python3 -c "from yaml import safe_load; f = open('$FILE'); y = safe_load(f); print(y['$KEY'])" > priv.key
chmod 600 priv.key

KEY=SSH_HOST
ssh_host=$(python3 -c "from yaml import safe_load; f = open('$FILE'); y = safe_load(f); print(y['$KEY'])")

KEY=SSH_USER
ssh_user=$(python3 -c "from yaml import safe_load; f = open('$FILE'); y = safe_load(f); print(y['$KEY'])")

# test
ssh -i priv.key ${ssh_user}@${ssh_host} "hostname"
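If the consumer is a Python program rather than a shell script, the same values can be read in one pass. This is a minimal sketch that parses an inline string shaped like the parameters.yaml above; the key body is a placeholder, not a real key.

```python
import yaml

# Same shape as parameters.yaml above; the key body is a placeholder.
text = """\
PRIVATE_KEY: |
  -----BEGIN OPENSSH PRIVATE KEY-----
  placeholder
  -----END OPENSSH PRIVATE KEY-----
SSH_HOST: 'host.name.or.ip'
SSH_USER: 'cloud-user'
"""

params = yaml.safe_load(text)

# The literal block keeps its line breaks and ends with one newline.
private_key = params["PRIVATE_KEY"]
print(private_key, end="")
print(params["SSH_USER"] + "@" + params["SSH_HOST"])
```

Because the loaded key already ends with a newline, it can be written to a file as-is, without the extra newline that print adds in the shell one-liners.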

For more details, see https://yaml-multiline.info/