Problems with MySQL encoding

Date: 2017-02-23 16:50:21

Tags: encoding utf-8 character-encoding

I have a serious problem with my database population script. Characters are not stored correctly. My code:

def _create_Historial(self):

    datos = [self.DB_HOST, self.DB_USER, self.DB_PASS, self.DB_NAME]

    conn = MySQLdb.connect(*datos)
    cursor = conn.cursor()
    cont = 0

    with open('principal/management/commands/Historial_fichajes_jugadores.csv', 'rv') as csvfile:
        historialReader = csv.reader(csvfile, delimiter=',')
        for row in historialReader:
            if cont == 0:
                cont += 1
            else:
                #unicodedata.normalize('NFKD', unicode(row[4], 'latin1')).encode('ASCII', 'ignore'),
                cursor.execute('''INSERT INTO principal_historial(jugador_id, temporada, fecha, ultimoClub, nuevoClub, valor, coste) VALUES (%s,%s,%s,%s,%s,%s,%s)''',
                               (round(float(row[1]))+1,row[2], self.stringToDate(row[3]), unicode(row[4],'utf-8'), row[5], self.convertValue(row[6]), str(row[7])))

    conn.commit()
    cursor.close()
    conn.close()

The error is the following:

Traceback (most recent call last):
File "/home/tfg/pycharm-2016.3.2/helpers/pycharm/django_manage.py", line 41, in <module>
run_module(manage_file, None, '__main__', True)
File "/usr/lib/python2.7/runpy.py", line 188, in run_module
fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 82, in _run_module_code
mod_name, mod_fname, mod_loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/tfg/TrabajoFinGrado/demoTFG/manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 443, in execute_from_command_line
utility.execute()
File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 382, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 196, in run_from_argv
self.execute(*args, **options.__dict__)
File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 232, in execute
output = self.handle(*args, **options)
File "/home/tfg/TrabajoFinGrado/demoTFG/principal/management/commands/populate_db.py", line 230, in handle
self._create_Historial()
File "/home/tfg/TrabajoFinGrado/demoTFG/principal/management/commands/populate_db.py", line 217, in _create_Historial
(round(float(row[1]))+1,row[2], self.stringToDate(row[3]), unicode(row[4],'utf-8'), row[5], self.convertValue(row[6]), str(row[7])))
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 187, in execute
query = query % tuple([db.literal(item) for item in args])
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py", line 278, in literal
return self.escape(o, self.encoders)
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py", line 208, in unicode_literal
return db.literal(u.encode(unicode_literal.charset))
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 6-7: ordinal not in range(256)

The characters look like this: Nicolás Otamendi, Gaël Clichy...

When I print the characters in the Python shell, they are displayed correctly.

Sorry for my English :(

1 answer:

Answer 0 (score: 0)

OK, I'll keep this brief.

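  1. You should convert encoded data (byte strs) to Unicode as early as possible in your code. Don't scatter .decode()/.encode()/unicode() calls inline.

  2. In Python 2.7, the built-in open() gives you byte strings (str), not Unicode. Use io.open(filename, encoding='utf-8') instead: it reads the file as text and decodes it from UTF-8 to Unicode.

  3. The Python 2.7 csv module is not Unicode-compatible. You should install https://github.com/ryanhiebert/backports.csv

  4. You need to tell the MySQL driver that you will be passing Unicode and that the connection should use UTF-8. This is done by adding the following to your connect() call: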

    charset='utf8',
    use_unicode=True
    
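  5. Pass Unicode strings to MySQL. Use the u'' prefix on literals to avoid troublesome implicit conversions.

  6. With the changes above, all your CSV data is already Unicode. There is no need to convert it.

  7. Putting it all together, your code will look like this: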

    import io

    import MySQLdb
    from backports import csv  # Unicode-aware replacement for the Python 2.7 csv module

    datos = [self.DB_HOST, self.DB_USER, self.DB_PASS, self.DB_NAME]

    # Have the driver accept Unicode and use UTF-8 on the connection
    conn = MySQLdb.connect(*datos, charset='utf8', use_unicode=True)
    cursor = conn.cursor()
    cont = 0

    # io.open decodes the file from UTF-8, so every field in row is already Unicode
    with io.open('principal/management/commands/Historial_fichajes_jugadores.csv', 'r', encoding='utf-8') as csvfile:
        historialReader = csv.reader(csvfile, delimiter=',')
        for row in historialReader:
            if cont == 0:
                # Skip the header row
                cont += 1
            else:
                # The parameters must be passed as a single tuple
                cursor.execute(u'''INSERT INTO principal_historial(jugador_id, temporada, fecha, ultimoClub, nuevoClub, valor, coste) VALUES (%s,%s,%s,%s,%s,%s,%s)''',
                               (round(float(row[1]))+1, row[2], self.stringToDate(row[3]), row[4], row[5], self.convertValue(row[6]), row[7]))

    conn.commit()
    cursor.close()
    conn.close()
    

    You may also want to look at https://stackoverflow.com/a/35444608/1554386, which explains what Python 2.7 Unicode strings are all about.
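
To see why points 2 and 4 matter, here is a minimal sketch that needs no database. It decodes UTF-8 bytes once at the boundary, then checks which charsets can represent each name. The sample names and the can_encode helper are hypothetical and only for illustration. The 'latin-1' codec is the one named in the last line of your traceback; the sketch only mimics the encoding step that failed there, it does not call MySQLdb.

```python
import io

# Hypothetical sample data (not taken from the asker's CSV), written with
# escapes so this source file itself stays pure ASCII.
sample = u'Nemanja Mati\u0107,Ga\u00ebl Clichy\n'
raw = sample.encode('utf-8')  # the bytes a UTF-8 file would hold on disk

# Decode once, at the boundary. io.TextIOWrapper over BytesIO stands in for
# io.open(filename, encoding='utf-8') on a real file.
text = io.TextIOWrapper(io.BytesIO(raw), encoding='utf-8').read()
names = text.strip().split(u',')


def can_encode(value, charset):
    """Return True if the Unicode string can be encoded in charset."""
    try:
        value.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# One (fits in latin-1?, fits in utf-8?) pair per name.
results = [(can_encode(n, 'latin-1'), can_encode(n, 'utf-8')) for n in names]
print(results)  # [(False, True), (True, True)]
```

The first name contains U+0107, which lies outside the 256 code points of latin-1, so encoding it fails just like the "ordinal not in range(256)" error above. The second name only has U+00EB, which latin-1 can represent, which is why some rows may appear to work. UTF-8 can encode both, which is what charset='utf8' asks the driver to use.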