Question

I am new to Python and need a hand with this code:

This code works fine; it converts the string as needed.

# -*- coding: utf-8 -*-
import sys
import arabic_reshaper
from bidi.algorithm import get_display

reshaped_text = arabic_reshaper.reshape(u' الحركات')
bidi_text = get_display(reshaped_text)
print >>open('out', 'w'), reshaped_text.encode('utf-8') # This is ok

When I try to read the string from a file, I get the error below:

# -*- coding: utf-8 -*-
import sys
import arabic_reshaper
from bidi.algorithm import get_display

with open("/home/nemo/Downloads/mpcabd-python-arabic-reshaper-552f3f4/data.txt", "r") as myfile:
    data = myfile.read().replace('\n', '')
reshaped_text = arabic_reshaper.reshape(data)
bidi_text = get_display(reshaped_text)
print >>open('out', 'w'), reshaped_text.encode('utf-8')

UnicodeDecodeError: 'ascii' codec can't decode byte 0xd8 in position 0: ordinal not in range(128)

Any help?

Thanks

Answer 1

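The decode() method decodes a string using the codec registered for the given encoding. If no encoding is passed, it falls back to the default string encoding, which in Python 2 is ASCII. That is why the byte 0xd8 in your file cannot be decoded.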
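To make the failure concrete, here is a minimal sketch that reproduces the error without any file. The sample bytes are an assumption chosen for illustration: they are the UTF-8 encoding of the Arabic letter alef, which starts with the same 0xd8 byte reported in the traceback.

```python
# -*- coding: utf-8 -*-
# Illustrative bytes: the UTF-8 encoding of the Arabic letter alef (U+0627).
raw = b'\xd8\xa7'

try:
    # Decoding with the ASCII codec fails, because 0xd8 is above 127.
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the codec the bytes were written in succeeds
# and yields a single Unicode character.
text = raw.decode('utf-8')
```

The same bytes are valid or invalid depending only on which codec is named, so the fix is to name UTF-8 explicitly rather than rely on the default.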

When you read a UTF-8 encoded file, you need to call decode('utf8') on the string you get back.

Write:

data = 'my data'
with open("file.txt", "w") as f:
    f.write(data.encode('utf-8'))

Read:

with open("file.txt", "r") as f:
    data = f.read().decode('utf-8')
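Put together, the write and read steps form a round trip. The sketch below is only an illustration: the scratch file and the sample string are assumptions, and the file is opened in binary mode so that the code behaves the same under Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import os
import tempfile

original = u'الحركات'  # sample text taken from the question

# Hypothetical scratch file, used only for this demonstration.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

try:
    # Write: encode the Unicode string to UTF-8 bytes.
    with open(path, 'wb') as f:
        f.write(original.encode('utf-8'))

    # Read: fetch the raw bytes, then decode them back to Unicode.
    with open(path, 'rb') as f:
        data = f.read().decode('utf-8')
finally:
    os.remove(path)

round_trip_ok = (data == original)
```

After the decode step, data is a Unicode string, which is the kind of value the question's first (working) script passes to arabic_reshaper.reshape.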

Answer 2

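You can also use the optional encoding parameter of the built-in open function (available in Python 3; in Python 2, io.open accepts the same parameter):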

with open("/home/nemo/Downloads/mpcabd-python-arabic-reshaper-552f3f4/data.txt",
          'rt',
          encoding='utf8') as f:
    data = f.read().replace('\n', '')
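Because the question's script is Python 2 (it uses print >>), one option that runs unchanged on Python 2.6+ and Python 3 is io.open. The sketch below is a suggestion rather than part of the original answer: read_text and write_text are hypothetical helper names, and the arabic_reshaper and bidi steps are left out.

```python
# -*- coding: utf-8 -*-
import io


def read_text(path, encoding='utf-8'):
    """Return the file's contents as a Unicode string with newlines removed."""
    with io.open(path, 'r', encoding=encoding) as f:
        return f.read().replace(u'\n', u'')


def write_text(path, text, encoding='utf-8'):
    """Write a Unicode string to path, encoding it on the way out."""
    with io.open(path, 'w', encoding=encoding) as f:
        f.write(text)
```

With helpers like these, the value returned by read_text should be usable as the argument to arabic_reshaper.reshape, and the result can be saved with write_text, with no explicit encode or decode calls in the script.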

Converting a string from an input file
