Question

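I want to convert an HTML page to PDF. To do that, I read data from Excel and store it in a Python dictionary. Then I format the string below.

Writing the Python variable data to the file: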

 html_file.write( html_rcc_string%(row["B_6.2OwnerName"],
                           row["B_6.3OwnerNameH"],))

In the code above, html_rcc_string contains the HTML, namely:

<table>
    <tr>
        <td>Owner name</td>
        <td>Owner name in hindi</td>
    </tr>
    <tr>
         <td>%s</td>
         <td>%s</td>
    </tr>
</table>

When I supply a dictionary value that contains a name in Hindi, it raises the following error.

UnicodeEncodeError: 'ascii' codec can't encode characters in position 4273-4279: ordinal not in range(128)

I searched Google but did not find anything. How can I display the user's name in Hindi? Any suggestions?

Answer 1

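From the excellent Pragmatic Unicode -or- How Do I Stop the Pain?, consider this advice: make a "Unicode sandwich": bytes on the outside, unicode on the inside. That is, convert all input to Unicode the moment you read it, and convert all output to UTF-8 the moment you write it.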
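A minimal sketch of the sandwich in isolation, assuming the input arrives as UTF-8 bytes. The variable names here are illustrative and do not come from the question:

```python
# -*- coding: utf-8 -*-
# Hypothetical input: UTF-8 bytes as they might arrive from a file or socket.
incoming_bytes = u'अभय'.encode('utf-8')

# Outer layer (input): decode bytes to Unicode as soon as they are read.
name = incoming_bytes.decode('utf-8')

# Inner layer: all processing happens on Unicode text.
cell = u'<td>%s</td>' % name

# Outer layer (output): encode to UTF-8 only at the moment of writing.
outgoing_bytes = cell.encode('utf-8')
```

Nothing in the middle step ever touches an encoded byte string, so no implicit ASCII conversion can be triggered there.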

Applying that logic to your program, I have:

# coding: utf8
row = {
  "B_6.2OwnerName": u'ABHAY',
  "B_6.3OwnerNameH": u'अभय' }

html_rcc_string = u'''
<table>
    <tr>
        <td>Owner name</td>
        <td>Owner name in hindi</td>
    </tr>
    <tr>
         <td>%s</td>
         <td>%s</td>
    </tr>
</table>
'''

# Open in binary mode: the data written below is already UTF-8 encoded bytes.
with open('/tmp/html_file.html', 'wb') as html_file:
    html_file.write( (html_rcc_string%(row["B_6.2OwnerName"],
                                      row["B_6.3OwnerNameH"],)).encode('utf8') )

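There are other ways to invoke the UTF-8 encoder, but the point remains: make sure all of your in-program data is unicode, not str. Only at the last moment do you convert to a UTF-8-encoded str.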
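One such alternative, as a sketch only: `io.open` accepts an `encoding` argument and does the encoding on write, so the program passes Unicode text all the way to the file object. The shortened template and the output path in the temp directory are assumptions made for this example, not part of the original program:

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

row = {
    "B_6.2OwnerName": u'ABHAY',
    "B_6.3OwnerNameH": u'अभय',
}

# Shortened stand-in for the full table template from the question.
html_rcc_string = u'<tr><td>%s</td><td>%s</td></tr>\n'

# Hypothetical output location chosen for this sketch.
out_path = os.path.join(tempfile.gettempdir(), 'html_file_sketch.html')

# io.open encodes to UTF-8 on write, so only Unicode text is handled here.
with io.open(out_path, 'w', encoding='utf-8') as html_file:
    html_file.write(html_rcc_string % (row["B_6.2OwnerName"],
                                       row["B_6.3OwnerNameH"]))
```

`io.open` is available in both Python 2.6+ and Python 3, where it is the same function as the built-in `open`.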
