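In the script below I use translate() to strip punctuation from a string. I have had a lot of trouble with translate() because it does not work on unicode. Now I've noticed that the script runs fine on the development server but raises an error on the production server.
The request is sent to Google App Engine by a Chrome extension. Any suggestions on how to fix this so that the same script works on the production server? Or is there another way to remove punctuation without using translate()?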
Log from the production server:
2011-10-11 06:18:10.384
get_rid_of_unicode: ajax: how to use xmlhttprequest
E 2011-10-11 06:18:10.384
expected a character buffer object
Traceback (most recent call last):
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 703, in __call__
    handler.post(*groups)
  File "/base/data/home/apps/ting-1/1.353888928453510037/ting.py", line 2073, in post
    user_tag_list_case = f1.striplist(main().split(" "))
  File "/base/data/home/apps/ting-1/1.353888928453510037/ting.py", line 2055, in main
    title_no_punctuation = get_rid_of_unicode.translate(None, string.punctuation)
TypeError: expected a character buffer object
The same script has no problems on the development server:
INFO 2011-10-11 13:15:49,154 ting.py:2052] get_rid_of_unicode: how to use xmlhttprequest
INFO 2011-10-11 13:15:49,154 ting.py:2057] title_no_punctuation: how to use xmlhttprequest
The script:
def main():
    title_lowercase = title.lower()
    title_without_possessives = remove_possessive(title_lowercase)
    title_without_double_quotes = remove_double_quotes(title_without_possessives)
    get_rid_of_unicode = title_without_double_quotes.encode('utf-8')
    title_no_punctuation = get_rid_of_unicode.translate(None, string.punctuation)
    back_to_unicode = unicode(title_no_punctuation, "utf-8")
    clean_title = remove_stop_words(back_to_unicode, f1.stop_words)
    return clean_title

user_tag_list = []
user_tag_list_case = f1.striplist(main().split(" "))
for tag in user_tag_list_case:
    user_tag_list.append(tag.lower())
Answer 0 (score: 2)
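Google App Engine runs Python 2.5.2. In that version str.translate() requires a 256-character string as its first argument; None has only been an accepted value since Python 2.6. That would explain why the same call succeeds on a development server running a newer local Python.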
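
As for the second half of the question (removing punctuation without translate()), one option is to filter the characters yourself. This works directly on unicode strings, so the encode/decode round trip is not needed, and it does not depend on the Python version. A minimal sketch; strip_punctuation and PUNCTUATION are names made up for this example and are not part of the original script:

```python
import string

# ASCII punctuation characters, stored in a set for fast membership tests.
# PUNCTUATION and strip_punctuation are illustrative names, not from ting.py.
PUNCTUATION = frozenset(string.punctuation)

def strip_punctuation(text):
    """Return text with every ASCII punctuation character removed."""
    # Non-ASCII characters are never in PUNCTUATION, so they pass through untouched.
    return "".join(ch for ch in text if ch not in PUNCTUATION)

# If you prefer to keep translate() on Python 2.5, passing an identity table
# instead of None should also work there (Python 2 only):
#   byte_string.translate(string.maketrans("", ""), string.punctuation)

cleaned = strip_punctuation("ajax: how to use xmlhttprequest")
```

With this, the encode('utf-8'), translate() and unicode(...) lines in main() could be replaced by a single call on title_without_double_quotes. Note that it only removes the characters in string.punctuation, the same set the original translate() call deletes; non-ASCII punctuation is left alone.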