我试图阅读YAML文件并将其转换为字典文件。我在将文件加载到dict变量时发现了一个问题。
我试图搜索类似的问题。 stackoverflow中的一个回复是用'\\xa0'
替换每个字符' '
。我试过line = line.replace('\\xa0',' ')
。这个程序不适用于Python 2.7版本。我尝试使用Python 3它运行正常。
import yaml
import sys
yaml_dir = "/root/tools/test_case/"
#file_name = "TC_CFD_SR.yml"
file_name = "TC_QB.yml"
tc_file_name = yaml_dir + file_name
def write(file,content):
file = open(file,'a')
file.write(content)
file.close()
def verifyYmlFile(yml_file):
data = {}
with open(yml_file, 'r') as fin:
for line in fin:
line = line.replace('\\xa0',' ')
write('anand-yaml.yml',line)
with open('anand-yaml.yml','r') as fin:
data = yaml.load(fin)
return data
if __name__ == '__main__':
data = {}
print "verifying yaml"
data= verifyYmlFile(tc_file_name)
错误:
[root@anand-harness test_case]# python verify_yaml.py
verifying yaml
Traceback (most recent call last):
File "verify_yaml.py", line 29, in <module>
data= verifyYmlFile(tc_file_name)
File "verify_yaml.py", line 23, in verifyYmlFile
data = yaml.load(fin)
File "/usr/lib64/python2.6/site-packages/yaml/__init__.py", line 71, in load
return loader.get_single_data()
File "/usr/lib64/python2.6/site-packages/yaml/constructor.py", line 37, in get_single_data
node = self.get_single_node()
File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 36, in get_single_node
document = self.compose_document()
File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 55, in compose_document
node = self.compose_node(None, None)
File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 82, in compose_node
node = self.compose_sequence_node(anchor)
File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 111, in compose_sequence_node
node.value.append(self.compose_node(node, index))
File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 84, in compose_node
node = self.compose_mapping_node(anchor)
File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
item_value = self.compose_node(node, item_key)
File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 64, in compose_node
if self.check_event(AliasEvent):
File "/usr/lib64/python2.6/site-packages/yaml/parser.py", line 98, in check_event
self.current_event = self.state()
File "/usr/lib64/python2.6/site-packages/yaml/parser.py", line 449, in parse_block_mapping_value
if not self.check_token(KeyToken, ValueToken, BlockEndToken):
File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 116, in check_token
self.fetch_more_tokens()
File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 244, in fetch_more_tokens
return self.fetch_single()
File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 653, in fetch_single
self.fetch_flow_scalar(style='\'')
File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 667, in fetch_flow_scalar
self.tokens.append(self.scan_flow_scalar(style))
File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 1156, in scan_flow_scalar
chunks.extend(self.scan_flow_scalar_non_spaces(double, start_mark))
File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 1196, in scan_flow_scalar_non_spaces
while self.peek(length) not in u'\'\"\\\0 \t\r\n\x85\u2028\u2029':
File "/usr/lib64/python2.6/site-packages/yaml/reader.py", line 91, in peek
self.update(index+1)
File "/usr/lib64/python2.6/site-packages/yaml/reader.py", line 165, in update
exc.encoding, exc.reason)
yaml.reader.ReaderError: 'utf8' codec can't decode byte #xa0: invalid start byte
in "anand-yaml.yml", position 3246
我错过了什么?
答案 0 :(得分:0)
字符序列“\\xa0
”不是您在邮件中看到的问题,问题是序列“\xa0
”(请注意反斜杠未转义)。
你的替换线应该是:
line = line.replace('\xa0',' ')
以规避问题。
如果您知道格式是什么,您可以自己进行正确的转换,但这不是必需的,或者上述修补不是结构性解决方案。最好是以正确的方式生成YAML文件(它们默认为UTF-8,因此它应包含正确的UTF-8)。如果没有相应的BOM(yaml库解释IIRC),它可能是UTF-16。
s1 = 'abc\\xa0xyz'
print(repr(s1))
u1 = s1.decode('utf-8') # this works fine
s = 'abc\xa0xyz'
print(repr(s))
u = s.decode('utf-8') # this throws an error