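I'm new to Python, and I'm trying to loop through a very large CSV file (4 GB) and load it into an MSSQL server. The current SQL tools don't seem to help!
My script is attached below. I get an error when I run it. Any help would be appreciated.
The MSSQL database exists, and the login and password are correct. I have also installed the pymssql module for Windows.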
E:\Python27>python -x parsedata_mssql.py
Traceback (most recent call last):
  File "parsedata_mssql.py", line 28, in <module>
    except mdb.Error, e:
NameError: name 'mdb' is not defined
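The NameError in this traceback can be reproduced without any database, which may help narrow things down. The sketch below is not the script itself: `connect_or_fail`, `run`, and `run_ok` are hypothetical names used only for illustration. It shows that Python evaluates the expression in an `except` clause only when the `try` body has already raised, so a NameError on `mdb` suggests that some other exception (for example a failed connection) occurred first and is being masked.

```python
def connect_or_fail():
    # Hypothetical stand-in for a database connect call that fails.
    raise RuntimeError("login failed")


def run():
    try:
        connect_or_fail()
    except mdb.Error as e:  # 'mdb' was never imported or defined
        return "handled"


def run_ok():
    try:
        pass  # nothing raises here, so the except clause is never evaluated
    except mdb.Error:
        return "handled"
    return "no error"


try:
    run()
except NameError as err:
    # The NameError replaces the RuntimeError raised inside the try body.
    caught = str(err)

print(caught)
print(run_ok())
```

Replacing `mdb.Error` with an exception class that actually exists in the imported module should let the underlying error surface.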
Here is my code:
#! /usr/bin/python
import csv
import sys
import _mssql

fields = [
    (0, 'name'),
    (1, 'street'),
    (2, 'city'),
    (3, 'state'),
    (4, 'zip'),
    (5, 'u1'),
    (6, 'u2'),
    (7, 'phone1'),
    (8, 'phone2'),
    (9, 'contactname'),
    (10, 'relationship'),
    (11, 'gender'),
    (12, 'u3'),
    (13, 'u4'),
    (14, 'industry'),
]

try:
    dbconn = _mssql.connect(server='localhost\SQLEXPRESS', user='sa',
                            password='password', database='2007usdata')
except mdb.Error, e:
    print "Error %d: %s" % (e.args[0], e.args[1])
    sys.exit(1)

with open('2007usdata.csv', 'rb') as infile:
    reader = csv.reader(infile)
    count = 0
    for line in reader:
        print "\n\nProcessing\n"
        print line
        if line:
            column_names = ','.join([name for (id, name) in fields])
            value_placeholders = (len(fields) - 1) * '%s, ' + '%s'
            query = "INSERT INTO info(%s) VALUES(%s)" % (column_names, value_placeholders)
            try:
                dbconn.execute_non_query(query, line)
                count += 1
                dbconn.commit()
            except mdb.Error, e:
                print "Error %d: %s" % (e.args[0], e.args[1])
                sys.exit(1)

dbconn.close()
print "\n\nDone: processed %d lines" % (count)
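One possible way to restructure the insert loop is sketched below, under several assumptions. `build_insert`, `load_rows`, and `FakeConnection` are hypothetical names that are not part of pymssql; `FakeConnection` exists only so the sketch runs without a server. With a real connection, the `db_error` argument would be `_mssql.MSSQLException`, the base exception class of the `_mssql` module, in place of the undefined `mdb.Error`. Each row is converted to a tuple on the assumption that `execute_non_query` wants a tuple or dict for multiple placeholders, and `commit()` is left out on the assumption that the low-level `_mssql` connection does not provide one; both points are worth checking against the pymssql documentation for the installed version.

```python
import csv

FIELDS = ['name', 'street', 'city', 'state', 'zip', 'u1', 'u2', 'phone1',
          'phone2', 'contactname', 'relationship', 'gender', 'u3', 'u4',
          'industry']


def build_insert(table, columns):
    """Build the INSERT statement once, outside the row loop."""
    placeholders = ', '.join(['%s'] * len(columns))
    return "INSERT INTO %s(%s) VALUES(%s)" % (
        table, ','.join(columns), placeholders)


def load_rows(conn, rows, query, db_error=Exception):
    """Insert each non-empty row and return how many were inserted.

    conn is assumed to expose execute_non_query(query, params).
    db_error is the exception class to catch (assumption: with pymssql
    this would be _mssql.MSSQLException).
    """
    count = 0
    for row in rows:
        if not row:
            continue  # skip blank lines in the CSV
        try:
            conn.execute_non_query(query, tuple(row))
        except db_error as e:
            raise RuntimeError("row %d failed: %s" % (count + 1, e))
        count += 1
    return count


class FakeConnection(object):
    """Hypothetical stand-in that records calls instead of hitting a server."""

    def __init__(self):
        self.calls = []

    def execute_non_query(self, query, params):
        self.calls.append((query, params))


query = build_insert('info', FIELDS)
# Two sample lines: one full record and one blank line.
sample_lines = ["a,b,c,d,e,f,g,h,i,j,k,l,m,n,o", ""]
conn = FakeConnection()
inserted = load_rows(conn, csv.reader(sample_lines), query)
print("Done: processed %d lines" % inserted)
```

Because `csv.reader` yields one row at a time, the 4 GB file is never held in memory as a whole; the per-row `print` in the original loop is dropped here since it is likely to slow down a load of that size considerably.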