我有一个python文件,用于在ms sql中创建和填充表。唯一的问题是,如果存在任何非ascii字符或单个撇号(并且每个都有很多),代码就会中断。虽然我可以运行替换函数来消除撇号的字符串,但我宁愿保持它们完好无损。我也尝试将数据转换为utf-8,但也没有运气。
以下是我收到的错误消息:
"'ascii' codec can't encode character u'\2013' in position..." (for non-ascii characters)
和单引号
class 'pyodbc.ProgrammingError'>: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server] Incorrect syntax near 'S, 230 X 90M.; Eligibilty....
当我尝试在utf-8中编码字符串时,我得到以下错误消息:
<type 'exceptions.UnicodeDecodeError'>: ascii' codec can't decode byte 0xe2 in position 219: ordinal not in range(128)
下面包含python代码。我相信发生此中断的代码中的点是在以下行之后:InsertValue = str(row.GetValue(CurrentField ['Name']))。
# -*- coding: utf-8 -*-
import pyodbc
import sys
import arcpy
import arcgisscripting
gp = arcgisscripting.create(9.3)
SQL_KEYWORDS = ['PERCENT', 'SELECT', 'INSERT', 'DROP', 'TABLE']
#SourceFGDB = '###'
#SourceTable = '###'
SourceTable = sys.argv[1]
TempInputName = sys.argv[2]
SourceTable2 = sys.argv[3]
#---------------------------------------------------------------------------------------------------------------------
# Target Database Settings
#---------------------------------------------------------------------------------------------------------------------
TargetDatabaseDriver = "{SQL Server}"
TargetDatabaseServer = "###"
TargetDatabaseName = "###"
TargetDatabaseUser = "###"
TargetDatabasePassword = "###"
# Get schema from FGDB table.
# This should be an ordered list of dictionary elements [{'FGDB_Name', 'FGDB_Alias', 'FGDB_Type', FGDB_Width, FGDB_Precision, FGDB_Scale}, {}]
if not gp.Exists(SourceTable):
print ('- The source does not exist.')
sys.exit(102)
#### Should see if it is actually a table type. Could be a Feature Data Set or something...
print(' - Processing Items From : ' + SourceTable)
FieldList = []
Field_List = gp.ListFields(SourceTable)
print(' - Getting number of rows.')
result = gp.GetCount_management(SourceTable)
Number_of_Features = gp.GetCount_management(SourceTable)
print(' - Number of Rows: ' + str(Number_of_Features))
print(' - Getting fields.')
Field_List1 = gp.ListFields(SourceTable, 'Layer')
Field_List2 = gp.ListFields(SourceTable, 'Comments')
Field_List3 = gp.ListFields(SourceTable, 'Category')
Field_List4 = gp.ListFields(SourceTable, 'State')
Field_List5 = gp.ListFields(SourceTable, 'Label')
Field_List6 = gp.ListFields(SourceTable, 'DateUpdate')
Field_List7 = gp.ListFields(SourceTable, 'OBJECTID')
for Current_Field in Field_List1 + Field_List2 + Field_List3 + Field_List4 + Field_List5 + Field_List6 + Field_List7:
print(' - Field Found: ' + Current_Field.Name)
if Current_Field.AliasName in SQL_KEYWORDS:
Target_Name = Current_Field.Name + '_'
else:
Target_Name = Current_Field.Name
print(' - Alias : ' + Current_Field.AliasName)
print(' - Type : ' + Current_Field.Type)
print(' - Length : ' + str(Current_Field.Length))
print(' - Scale : ' + str(Current_Field.Scale))
print(' - Precision: ' + str(Current_Field.Precision))
FieldList.append({'Name': Current_Field.Name, 'AliasName': Current_Field.AliasName, 'Type': Current_Field.Type, 'Length': Current_Field.Length, 'Scale': Current_Field.Scale, 'Precision': Current_Field.Precision, 'Unique': 'UNIQUE', 'Target_Name': Target_Name})
# Create table in SQL Server based on FGDB table schema.
cnxn = pyodbc.connect(r'DRIVER={SQL Server};SERVER=###;DATABASE=###;UID=sql_webenvas;PWD=###')
cursor = cnxn .cursor()
#### DROP the table first?
try:
DropTableSQL = 'DROP TABLE dbo.' + TempInputName + '_Test;'
print DropTableSQL
cursor.execute(DropTableSQL)
dbconnection.commit()
except:
print('WARNING: Can not drop table - may not exist: ' + TempInputName + '_Test')
CreateTableSQL = ('CREATE TABLE ' + TempInputName + '_Test '
' (Layer varchar(500), Comments varchar(5000), State int, Label varchar(500), DateUpdate DATETIME, Category varchar(50), OBJECTID int)')
cursor.execute(CreateTableSQL)
cnxn.commit()
# Cursor through each row in the FGDB table, get values, and insert into the SQL Server Table.
# We got Number_of_Features earlier, just use that.
Number_Processed = 0
print(' - Processing ' + str(Number_of_Features) + ' features.')
rows = gp.SearchCursor(SourceTable)
row = rows.Next()
while row:
if Number_Processed % 10000 == 0:
print(' - Processed ' + str(Number_Processed) + ' of ' + str(Number_of_Features))
InsertSQLFields = 'INSERT INTO ' + TempInputName + '_Test ('
InsertSQLValues = 'VALUES ('
for CurrentField in FieldList:
InsertSQLFields = InsertSQLFields + CurrentField['Target_Name'] + ', '
InsertValue = str(row.GetValue(CurrentField['Name']))
if InsertValue in ['None']:
InsertValue = 'NULL'
# Use an escape quote for the SQL.
InsertValue = InsertValue.replace("'","' '")
if CurrentField['Type'].upper() in ['STRING', 'CHAR', 'TEXT']:
if InsertValue == 'NULL':
InsertSQLValues = InsertSQLValues + "NULL, "
else:
InsertSQLValues = InsertSQLValues + "'" + InsertValue + "', "
elif CurrentField['Type'].upper() in ['GEOMETRY']:
## We're not handling geometry transfers at this time.
if InsertValue == 'NULL':
InsertSQLValues = InsertSQLValues + '0' + ', '
else:
InsertSQLValues = InsertSQLValues + '1' + ', '
else:
InsertSQLValues = InsertSQLValues + InsertValue + ', '
InsertSQLFields = InsertSQLFields[:-2] + ')'
InsertSQLValues = InsertSQLValues[:-2] + ')'
InsertSQL = InsertSQLFields + ' ' + InsertSQLValues
## print InsertSQL
cursor.execute(InsertSQL)
cnxn.commit()
Number_Processed = Number_Processed + 1
row = rows.Next()
print(' - Processed all ' + str(Number_Processed))
del row
del rows
答案 0 :(得分:3)
詹姆斯,我相信真正的问题是你并没有全面使用Unicode。尝试执行以下操作:
另外,请提供完整的堆栈跟踪和例外名称。
答案 1 :(得分:1)
我将使用我的通灵调试技巧,并说你正试图str()
ify something并使用ascii编解码器出错。你真正应该做的是使用utf-8编解码器,如下所示:
insert_value_uni = unicode(row.GetValue(CurrentField['Name']))
InsertValue = insert_value_uni.encode('utf-8')
答案 2 :(得分:0)
或者您可以认为只允许使用ASCII并使用名为Unicode Hammer
的名称答案 3 :(得分:0)
通常,您希望在数据输入上转换为unicode,并在输出时转换为所需的编码。
如果你这样做会更容易找到你的问题。这意味着将所有字符串更改为unicode,“INSERT INTO”改为u'INSERT INTO'。 (注意字符串前面的“u”) 然后当您发送要执行的字符串转换为所需的编码时,“utf8”。
cursor.execute(InsertSQL.encode("utf8")) # Where InsertSQL is unicode
此外,您应该将编码字符串添加到源代码的顶部。 这意味着将编码cookie添加到文件的前两行之一:
#!/usr/bin/python
# -*- coding: <encoding name> -*-
如果您从文件中提取数据以构建字符串,则可以使用codecs.open在加载时自动从特定编码转换为unicode。
答案 4 :(得分:0)
当我将str()转换为unicode时,解决了问题。一个简单的答案,我感谢大家对此的帮助。