Python issue in Google App Engine - UTF-8 and ASCII

Time: 2011-08-21 14:21:52

Tags: python google-app-engine utf-8 ascii

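For the past few days I have been trying to learn Python in App Engine. However, I have run into a number of problems with ASCII and UTF encoding. The most recent one is as follows: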

I have a simple chat room script from the book "Code in the Cloud":

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import datetime


# START: MainPage
class ChatMessage(object):
    def __init__(self, user, msg):
        self.user = user
        self.message = msg
        self.time = datetime.datetime.now()

    def __str__(self):
        return "%s (%s): %s" % (self.user, self.time, self.message)

Messages = []

class ChatRoomPage(webapp.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/html"
        self.response.out.write("""
           <html>
             <head>
               <title>MarkCC's AppEngine Chat Room</title>
             </head>
             <body>
               <h1>Welcome to MarkCC's AppEngine Chat Room</h1>
               <p>(Current time is %s)</p>
           """ % (datetime.datetime.now()))
        # Output the set of chat messages
        global Messages
        for msg in Messages:
            self.response.out.write("<p>%s</p>" % msg)
        self.response.out.write("""
           <form action="" method="post">
           <div><b>Name:</b>
           <textarea name="name" rows="1" cols="20"></textarea></div>
           <p><b>Message</b></p>
           <div><textarea name="message" rows="5" cols="60"></textarea></div>
           <div><input type="submit" value="Send ChatMessage"></input></div>
           </form>
         </body>
       </html>
       """)
    # END: MainPage
    # START: PostHandler
    def post(self):
        chatter = self.request.get("name")
        msg = self.request.get("message")
        global Messages
        Messages.append(ChatMessage(chatter, msg))
        # Now that we've added the message to the chat, we'll redirect
        # to the root page, which will make the user's browser refresh to
        # show the chat including their new message.
        self.redirect('/')
# END: PostHandler


# START: Frame
chatapp = webapp.WSGIApplication([('/', ChatRoomPage)])


def main():
    run_wsgi_app(chatapp)

if __name__ == "__main__":
    main()
# END: Frame

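This works well in English. However, the moment I add some non-standard characters, all sorts of problems start.

First of all, to make the page able to display the characters in HTML at all, I added a meta tag (charset=UTF-8 and so on).

Curiously, if you type non-standard letters into the chat, the program handles them nicely and displays them without any issue. However, if I put any non-ASCII letters into the web layout in the script itself, the page fails to load. I figured out that adding the utf-8 encoding line would help, so I added it (# -*- coding: utf-8 -*-). That was not enough: of course, I had forgotten to save the file in UTF-8 format. After that the program started running.

That would have been a good end to the story, alas...

It doesn't work.

Long story short, this code: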

# -*- coding: utf-8 -*-
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import datetime


# START: MainPage
class ChatMessage(object):
    def __init__(self, user, msg):
        self.user = user
        self.message = msg
        self.time = datetime.datetime.now()

    def __str__(self):
        return "%s (%s): %s" % (self.user, self.time, self.message)

Messages = []

class ChatRoomPage(webapp.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/html"
        self.response.out.write("""
           <html>
             <head>
               <title>Witaj w pokoju czatu MarkCC w App Engine</title>
               <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
             </head>
             <body>
               <h1>Witaj w pokoju czatu MarkCC w App Engine</h1>
               <p>(Dokladny czas Twojego logowania to: %s)</p>
           """ % (datetime.datetime.now()))
        # Output the set of chat messages
        global Messages
        for msg in Messages:
            self.response.out.write("<p>%s</p>" % msg)
        self.response.out.write("""
           <form action="" method="post">
           <div><b>Twój Nick:</b>
           <textarea name="name" rows="1" cols="20"></textarea></div>
           <p><b>Twoja Wiadomość</b></p>
           <div><textarea name="message" rows="5" cols="60"></textarea></div>
           <div><input type="submit" value="Send ChatMessage"></input></div>
           </form>
         </body>
       </html>
       """)
    # END: MainPage
    # START: PostHandler
    def post(self):
        chatter = self.request.get(u"name")
        msg = self.request.get(u"message")
        global Messages
        Messages.append(ChatMessage(chatter, msg))
        # Now that we've added the message to the chat, we'll redirect
        # to the root page, which will make the user's browser refresh to
        # show the chat including their new message.
        self.redirect('/')
# END: PostHandler


# START: Frame
chatapp = webapp.WSGIApplication([('/', ChatRoomPage)])


def main():
    run_wsgi_app(chatapp)

if __name__ == "__main__":
    main()
# END: Frame

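fails to handle anything I type while the chat application is running. The page loads, but the moment I submit a message (even one using only standard characters) I receive the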

  File "D:\Python25\lib\StringIO.py", line 270, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 64: ordinal not in range(128)

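error message. In other words, if I want to be able to use any characters within the application, I cannot put non-English characters in my interface. Or, the other way round, I can use non-English characters within the app only if I do not save the file as UTF-8. How do I make the two work together?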

2 answers:

Answer 0: (score: 2)

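Your strings contain unicode characters, but they are not unicode strings; they are byte strings. You need to prefix each one with u (as in u"foo") to make it a unicode string. If you make sure all your strings are unicode strings, that error should go away.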
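To see where the 0xc3 in the traceback comes from, here is a minimal sketch. It assumes the source file is saved as UTF-8 and is written so that it also runs on current Python 3, where the decode has to be spelled out; Python 2 performs it implicitly whenever a byte string is combined with a unicode string. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# The form label from the page, as the UTF-8 bytes a plain (non-u) literal
# holds in Python 2 when the file is saved as UTF-8.
label_bytes = u"Twój Nick:".encode("utf-8")

# First byte outside the ASCII range: the lead byte of the two-byte
# UTF-8 sequence for "ó".
first_non_ascii = [b for b in bytearray(label_bytes) if b > 127][0]

# Python 2 runs this decode implicitly when the byte string is joined
# with a unicode string; here it is done explicitly.
try:
    label_bytes.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(hex(first_non_ascii))
print(error)
```

The values returned by self.request.get() are unicode strings, so joining them with the byte-string page fragments is most likely what triggers this decode in the handler above.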

You should also specify the encoding in the Content-Type header rather than in a meta tag, like this:

self.response.headers['Content-Type'] = 'text/html; charset=UTF-8'
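A rough sketch of how the two pieces of advice fit together, kept independent of App Engine so it can be run anywhere: build the whole page from unicode strings and encode it to UTF-8 once, at the point where it is sent. The build_response function and its arguments are hypothetical, not part of the webapp API.

```python
# -*- coding: utf-8 -*-
def build_response(messages):
    """Return (headers, body_bytes) for a hypothetical chat page."""
    # Declare the encoding in the HTTP header, as suggested above.
    headers = {"Content-Type": "text/html; charset=UTF-8"}

    # Every fragment is a unicode string, so nothing is decoded implicitly.
    parts = [u"<html><body><h1>Witaj w pokoju czatu</h1>"]
    for user, text in messages:
        parts.append(u"<p>%s: %s</p>" % (user, text))
    parts.append(u"</body></html>")

    # Encode exactly once, at the output boundary.
    body = u"".join(parts).encode("utf-8")
    return headers, body


headers, body = build_response([(u"Zoś", u"Cześć")])
print(headers["Content-Type"])
print(len(body))
```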

Note that your life would be a lot easier if you used a templating system instead of writing HTML inline in your Python code.
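As a small illustration of the idea, the sketch below uses string.Template from the standard library as a stand-in for a real templating system; the template text and field names are made up for the example. The markup lives in one unicode template, the values are filled in as unicode, and the result is encoded once.

```python
# -*- coding: utf-8 -*-
from string import Template

# Hypothetical page template; a real template engine would load this from a file.
PAGE = Template(u"<h1>$title</h1>\n<p>$user: $message</p>")

# Fill in unicode values; the result is still a unicode string.
html = PAGE.substitute(title=u"Witaj w pokoju czatu",
                       user=u"Zoś",
                       message=u"Cześć")

# Encode once when the page is written out.
body = html.encode("utf-8")
print(len(body))
```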

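Answer 1: (score: 1)

@Thomas K. Thank you for your guidance. Thanks to you I was able to come up with a solution (perhaps, as you said, a somewhat kludgy one), so the credit for the answer should go to you. The following line of code: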

Messages.append(ChatMessage(chatter, msg))

should look like this:

Messages.append(ChatMessage(chatter.encode( "utf-8" ), msg.encode( "utf-8" )))

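Basically, I had to encode the unicode strings coming from the request into UTF-8 byte strings (UTF-8, not ASCII), so that they match the UTF-8 byte strings already in the page.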
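A short sketch of why this works, with made-up values standing in for what self.request.get() returns. It uses b"" literals, so it targets Python 2.6+ or Python 3 rather than the Python 2.5 runtime from the question.

```python
# -*- coding: utf-8 -*-
# Hypothetical unicode values standing in for the submitted form fields.
chatter = u"Zoś"
msg = u"Cześć"

# A page fragment as the UTF-8 bytes stored in the saved source file.
page_fragment = u"<b>Twój Nick:</b> ".encode("utf-8")

# The fix from this answer: encode the unicode values to UTF-8 byte strings.
chatter_bytes = chatter.encode("utf-8")
msg_bytes = msg.encode("utf-8")

# Every piece is now a byte string in the same encoding, so joining them
# never triggers an implicit ASCII decode.
body = b"".join([page_fragment, chatter_bytes, b": ", msg_bytes])

# The result is UTF-8 rather than ASCII: it holds bytes above 127 and
# decodes back to the original text.
is_ascii = all(b < 128 for b in bytearray(body))
round_trip = body.decode("utf-8")
print(is_ascii)
```

The other direction, keeping everything as unicode and encoding only when the response is written, avoids having to remember the encode call at every place a message is stored.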