I want to convert the following unicode string to a dictionary in Python:
Version: 67
Build Number: 123master
Project Name: git+ssh://git@stash.xyz.com:1999/ns/x-y.git
Git Url: origin/master
Git Branch: 0223445as5njn34nfk6kg
perforce_url:
  //project//url//
artifacts:
  "./": "./"
exclude:
  - manifest.yml
  - node_modules
  - RPMS
  - .git
  - build-toolbox
>>> x
' Version: 67\nBuild Number: 123master\nProject Name: git+ssh://git@stash.xyz.com:1999/ns/x-y.git\nGit Url: origin/master\nGit Branch: 0223445as5njn34nfk6kg\nperforce_url:\n //project//url//\nartifacts:\n "./": "./"\nexclude:\n - manifest.yml\n - node_modules\n - RPMS\n - .git\n - build-toolbox '
>>> dict(map(lambda x: x.split(':'), x.splitlines()))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: dictionary update sequence element #2 has length 4; 2 is required
>>> dict(item.split(":") for item in x.splitlines())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: dictionary update sequence element #2 has length 4; 2 is required
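I think the problem is with the following entry:
Project Name: git+ssh://git@stash.xyz.com:1999/ns/x-y.git
The git URL above contains additional ":" characters, so when I try to split each line, I get the following error:
ValueError: dictionary update sequence element #2 has length 4; 2 is required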
dict(item.split(":") for item in data.splitlines())
dict(map(lambda x: x.split(':'), x.splitlines()))
Expected result:
{"version" : 67 , " Build Number" : "123master", "Project Name": "xyz", "Git Url" : "origin/master", "Git Branch": "0223445as5njn34nfk6kg", "perforce_url":
"//project//url//", "artifacts" : " "./": "./" "}
Actual result:
ValueError: dictionary update sequence element #2 has length 4; 2 is required
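Answer 0 (score: 1)
Your immediate problem is the colon: you can use line.split(':', 1), which splits only at the first colon.
You will also run into problems with the list elements. Is the input file YAML?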
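import yaml  # PyYAML

# Nested values and list items must be indented to be valid YAML.
data = """
Version: 67
Build Number: 123master
Project Name: git+ssh://git@stash.xyz.com:1999/ns/x-y.git
Git Url: origin/master
Git Branch: 0223445as5njn34nfk6kg
perforce_url:
  //project//url//
artifacts:
  "./": "./"
exclude:
  - manifest.yml
  - node_modules
  - RPMS
  - .git
  - build-toolbox
"""
# safe_load avoids constructing arbitrary Python objects; recent PyYAML
# versions also require an explicit Loader argument for yaml.load.
yaml.safe_load(data)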
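If the input turns out not to be valid YAML, the split(':', 1) idea can be sketched by hand. This is only a minimal sketch: parse_entries and sample are hypothetical names, and it assumes that top-level keys start in the first column while continuation values and list items are indented.

```python
def parse_entries(text):
    """Build a dict from 'key: value' lines, splitting on the first colon only."""
    result = {}
    last_key = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        if raw[0] in " \t" and last_key is not None:
            # Assumption: an indented line belongs to the most recent key.
            item = raw.strip()
            if item.startswith("- "):
                # List element: collect it in a list under the last key.
                if not isinstance(result[last_key], list):
                    result[last_key] = []
                result[last_key].append(item[2:].strip())
            else:
                # Plain continuation line: use the whole line as the value.
                result[last_key] = item
            continue
        # maxsplit=1 keeps any later colons (e.g. in a URL) inside the value.
        # Raises ValueError if a top-level line has no colon at all.
        key, value = raw.split(":", 1)
        last_key = key.strip()
        result[last_key] = value.strip()
    return result


sample = """\
Version: 67
Build Number: 123master
Project Name: git+ssh://git@stash.xyz.com:1999/ns/x-y.git
Git Url: origin/master
Git Branch: 0223445as5njn34nfk6kg
perforce_url:
  //project//url//
artifacts:
  "./": "./"
exclude:
  - manifest.yml
  - node_modules
  - RPMS
  - .git
  - build-toolbox
"""

parsed = parse_entries(sample)
print(parsed["Project Name"])
print(parsed["exclude"])
```

All values stay strings here (so Version is '67', not 67), and nested mappings such as artifacts are kept as raw text; a real YAML parser handles both.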
Answer 1 (score: 1)
You could write your own parser for this YAML-like format, for example with parsimonious:
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor
data = """
Version: 67
Build Number: 123master
Project Name: git+ssh://git@stash.xyz.com:1999/ns/x-y.git
Git Url: origin/master
Git Branch: 0223445as5njn34nfk6kg
perforce_url:
  //project//url//
artifacts:
  "./": "./"
exclude:
  - manifest.yml
  - node_modules
  - RPMS
  - .git
  - build-toolbox
"""
grammar = Grammar(
r"""
entries = (item / list / ws)+
item = key equal ws* (!listsep value) newline?
list = listkey listitem+
listkey = key equal hs* newline
listitem = hs* listsep hs* value newline?
key = ~"^[^:]+"m
equal = ":"
listsep = "-"
value = ~".+"
ws = (hs / newline)+
newline = ~"[\r\n]"
hs = ~"[\t ]+"
"""
)
tree = grammar.parse(data)
class YAMLVisitor(NodeVisitor):
    def generic_visit(self, node, visited_children):
        return visited_children or node

    def visit_key(self, node, visited_children):
        return node.text.strip()

    def visit_value(self, node, visited_children):
        return node.text.strip()

    def visit_listkey(self, node, visited_children):
        key, *_ = visited_children
        return key

    def visit_listitem(self, node, visited_children):
        *_, value, _ = visited_children
        return value

    def visit_list(self, node, visited_children):
        key, values = visited_children
        return (key, values)

    def visit_item(self, node, visited_children):
        key, *_, lst, nl = visited_children
        value = lst[1]
        return (key, value)

    def visit_entries(self, node, visited_children):
        output = dict([child[0] for child in visited_children if isinstance(child[0], tuple)])
        return output

yaml = YAMLVisitor()
output = yaml.visit(tree)
print(output)
{'Version': '67', 'Build Number': '123master', 'Project Name': 'git+ssh://git@stash.xyz.com:1999/ns/x-y.git', 'Git Url': 'origin/master', 'Git Branch': '0223445as5njn34nfk6kg', 'perforce_url': '//project//url//', 'artifacts': '"./": "./"', 'exclude': ['manifest.yml', 'node_modules', 'RPMS', '.git', 'build-toolbox']}