Google Translate API - EncodeError from translated text

Time: 2016-08-26 10:59:20

Tags: python google-api encode

I'm having trouble encoding text that comes back, as JSON, from a translation made with the Google Translate API. I'm also a beginner with both Python and the Google APIs.

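Below is a basic script that reads structure IDs from a CSV file, selects the English description from the database, and tries to write the translated description into another table.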

After the translation step:

t = service.translations().list(source='%s' % trans, \
        target='%s' % lang, q=[message2t]).execute()
translated = t['translations'][0]['translatedText']

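My unicode variable translated contains dirty characters (I see the problem with languages such as German or French), and I don't know how to get the correct characters.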
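
One possible cause, offered only as a guess since the question does not show the actual output: the v2 API returns HTML-formatted text by default, so translatedText may contain entities such as `&#39;` or `&#252;` instead of the characters themselves. A minimal sketch of decoding them follows; `clean_translation` and the sample string are hypothetical, not part of the original script or a real API response.

```python
# -*- coding: utf-8 -*-
try:
    from html import unescape  # Python 3
except ImportError:  # Python 2 fallback
    from HTMLParser import HTMLParser
    unescape = HTMLParser().unescape


def clean_translation(text):
    """Turn HTML entities (e.g. &#252;) back into real characters."""
    return unescape(text)


# Hypothetical sample, only to illustrate the decoding step.
sample = u"F&#252;r G&#228;ste &amp; Familien"
cleaned = clean_translation(sample)
```

The v2 API also documents a `format` request parameter (`text` or `html`); whether passing `format='text'` to `translations().list(...)` is enough to avoid the entities here has not been tested.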

In fact, when I try to write the string to the database, I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 225: ordinal not in range(128)
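
This error most likely means a unicode string is being converted to bytes implicitly with the ASCII codec, which cannot represent u'\xfc' (ü). The sketch below reproduces that failure and shows the explicit UTF-8 alternative; the sample text and the helper `to_utf8_bytes` are assumptions made for illustration.

```python
# -*- coding: utf-8 -*-

# Hypothetical translated text containing u'\xfc', as in the traceback.
translated = u"Gr\xfc\xdfe aus M\xfcnchen"


def to_utf8_bytes(text):
    """Encode a unicode string as UTF-8 bytes explicitly."""
    return text.encode("utf-8")


# Roughly what Python 2 does implicitly when a unicode object is handed
# to an API that expects a byte string.
try:
    translated.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly avoids the implicit ASCII step.
utf8_bytes = to_utf8_bytes(translated)
```

Whether the bytes are then stored correctly still depends on the character set of the connection and the table, which the question does not specify.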

Here is the complete basic code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from googleapiclient.discovery import build
import _mysql
import time

trans = 'en'
langs = [ 'de', 'dk', 'es', 'fr', 'it', 'nl', 'no', 'pg', 'pl', 'sw' ]

# Google API Environment
key = 'MYKEY'
service = build('translate', 'v2', developerKey=key)

# Open DB connection
db = _mysql.connect(user='MYUSER',
                    passwd='MYPASSWORD',
                    host='MYRDS',
                    port=3306,
                    db='MYDB')

for lang in langs:
  print 'Finding structures w/o description in {} language'.format(lang.upper())

  with open('nodesc_%s.csv' % lang, 'r') as structures:
    for structure in structures:
      id_str = structure.split('\t')[0]

      text2t = """SELECT `text` FROM `texts` WHERE
                  `str_ID`='%s' AND
                  `type`='description' AND
                  `lang`='%s';""" % (id_str, trans)
      db.query(text2t)
      r = db.store_result()
      message2t = r.fetch_row()[0][0]

      # Check that a description really exists (non-empty string)
      if message2t:
        t = service.translations().list(source='%s' % trans, \
            target='%s' % lang, q=[message2t]).execute()
        translated = t['translations'][0]['translatedText']
        now = time.strftime("%Y-%m-%d %H:%M:%S")

        texttranslated = """INSERT INTO `descriptions`
                            (`ID_desc`, `ID_str`, `text`, `lang`, `human_date`, `google_date`)
                            VALUES (NULL, '%s', '%s', '%s', '0000-00-00 00:00:00', '%s')""" \
                            % (id_str, translated, lang, now)
        db.query(texttranslated)

      else:
        print 'Structure with id {} has no description in English'.format(id_str)

# Close DB connection
db.close()
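
The INSERT in the script builds SQL by string interpolation, so the translated text is both encoded implicitly and left unescaped (an apostrophe in a translation would break the statement). A common alternative is to pass the values as bound parameters. The sketch below uses the standard library's sqlite3 as a stand-in for MySQL so that it runs on its own; the simplified schema and the `save_translation` helper are assumptions, not the original setup.

```python
# -*- coding: utf-8 -*-
import sqlite3
import time


def save_translation(conn, id_str, translated, lang):
    """Insert one translated description using bound parameters."""
    now = time.strftime("%Y-%m-%d %H:%M:%S")
    conn.execute(
        "INSERT INTO descriptions (ID_str, text, lang, google_date) "
        "VALUES (?, ?, ?, ?)",
        (id_str, translated, lang, now),
    )
    conn.commit()


# In-memory stand-in for the real MySQL table (schema simplified).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE descriptions ("
    "ID_desc INTEGER PRIMARY KEY, ID_str TEXT, text TEXT, "
    "lang TEXT, google_date TEXT)"
)

# Non-ASCII characters and an apostrophe both pass through unchanged.
save_translation(conn, "42", u"F\xfcr G\xe4ste's Zimmer", "de")
stored = conn.execute("SELECT text FROM descriptions").fetchone()[0]
```

With MySQL, the rough equivalent would be the higher-level MySQLdb module rather than _mysql: open the connection with `charset='utf8'` and `use_unicode=True`, then call `cursor.execute(sql, params)` with `%s` placeholders. That variant is not tested against this database.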

0 answers:

No answers