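I have a text file that I need to import into a database, possibly using SQL. I am just trying to get the addresses into a table in the following format:
Address, City, State, Zip.
Here is an altered example of the text I am working with: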
DOE, JANE
id:0123456465 alt id:246465 165165 department:TEST
*Address Information-- Mailing Address:1
address1:
City, State:TEST CITY, STATE
Line1:14566 Test Avenue
Zip:12345
address2:
none
address3:
none
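I only need the information under address1, but there are 40,000 entries. Does anyone have a way for me to do this efficiently?
Answer 0 (score: 0)
As an example of a Python implementation: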
import re

# Raw strings so that \d reaches the regex engine unchanged.
idPat = re.compile(r"id:(\d+)")
cityPat = re.compile(r"City, State:(.*)")
zipPat = re.compile(r"Zip:(\d{5})")

with open('data.txt', 'r') as f:
    data = f.read().splitlines()  # one list element per line

ID = CITY_STATE = ZIP = None
for line in data:
    if ID is None:
        # Start of a record: take the first "id:" on the line.
        m = idPat.search(line)
        if m is not None:
            ID = m.group(1)  # group(1) is the captured value; group(0) is the whole match
            CITY_STATE = ZIP = None
    elif CITY_STATE is None:
        m = cityPat.search(line)
        if m is not None:
            CITY_STATE = m.group(1).replace("'", "''")  # escape quotes for SQL
    elif ZIP is None:
        m = zipPat.search(line)
        if m is not None:
            ZIP = m.group(1)
            # In the sample, the first Zip after the id belongs to address1.
            print("UPDATE `table` SET `CityState`='%s', `Zip`='%s' WHERE `id`='%s';"
                  % (CITY_STATE, ZIP, ID))
            ID = CITY_STATE = ZIP = None
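This just prints out a batch of SQL statements. You can bulk-import them through the phpMyAdmin console (for example). Alternatively, Python has bindings for connecting to many database types.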
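To illustrate the point about database bindings, here is a minimal sketch that inserts the parsed rows directly instead of printing SQL. It uses Python's built-in sqlite3 module purely as a stand-in for whatever database you actually have; the table name `address`, its columns, and the helper names `parse_records` and `load` are assumptions made for this example. The `?` placeholders let the driver handle quoting, so no manual escaping is needed.

```python
import re
import sqlite3

ID_PAT = re.compile(r"id:(\d+)")
CITY_PAT = re.compile(r"City, State:(.*)")
ZIP_PAT = re.compile(r"Zip:(\d{5})")


def parse_records(lines):
    """Yield (id, city_state, zip) for the first address of each record."""
    rec_id = city_state = None
    for line in lines:
        if rec_id is None:
            m = ID_PAT.search(line)
            if m:
                rec_id = m.group(1)
        elif city_state is None:
            m = CITY_PAT.search(line)
            if m:
                city_state = m.group(1).strip()
        else:
            m = ZIP_PAT.search(line)
            if m:
                yield rec_id, city_state, m.group(1)
                rec_id = city_state = None  # wait for the next record


def load(conn, lines):
    """Insert the parsed rows; the table and column names are hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS address "
        "(id TEXT PRIMARY KEY, city_state TEXT, zip TEXT)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO address (id, city_state, zip) VALUES (?, ?, ?)",
            parse_records(lines),
        )
```

With a file on disk, something like `load(sqlite3.connect('addresses.db'), open('data.txt'))` would populate the table in one transaction, which tends to matter for speed at 40,000 entries. Other drivers follow the same DB-API pattern but may use a different placeholder style.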
Answer 1 (score: 0)
If the structure is always strict and the values contain only ASCII characters, something like this will produce an SQL file:
inAddress1 = False
address = {}
dataNo = 0  # just to see how much has been processed

# The with statement closes both files even if an error occurs.
with open('janedoe.txt', 'r') as fin, open('janedoe.sql', 'w') as fout:
    for line in fin:
        line = line.strip()
        if inAddress1:
            if line == 'address2:':
                # End of the address1 block: write one INSERT statement.
                fout.write("insert into Address (CityState, Line1, Zip) "
                           "values ('%s', '%s', '%s');\n" %
                           (address['CityState'], address['Line1'], address['Zip']))
                inAddress1 = False
            else:
                key, value = line.split(':', 1)
                key = key.replace(',', '').replace(' ', '')  # "City, State" -> "CityState"
                value = value.replace("'", "''")  # escape quotes for SQL
                address[key] = value
        elif line == 'address1:':
            dataNo += 1
            if dataNo % 100 == 0:
                print(dataNo)
            inAddress1 = True
            address = {}
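Since this approach depends on the structure being strict, it may be worth checking that assumption before generating SQL for all 40,000 entries. The following is a rough sketch of such a check, not part of the answer itself: the names `iter_address1`, `find_incomplete`, and `REQUIRED_KEYS` are made up for illustration, and it assumes every address1 block is followed by an `address2:` line, as in the sample.

```python
REQUIRED_KEYS = ("CityState", "Line1", "Zip")


def iter_address1(lines):
    """Yield one dict per address1 block, with keys like "CityState"."""
    address = None  # None means we are outside an address1 block
    for raw in lines:
        line = raw.strip()
        if line == "address1:":
            address = {}
        elif line == "address2:":
            if address is not None:
                yield address
            address = None
        elif address is not None and ":" in line:
            key, value = line.split(":", 1)
            address[key.replace(",", "").replace(" ", "")] = value
        # Lines without a colon inside the block are skipped rather than fatal.


def find_incomplete(lines):
    """Return (block_number, missing_keys) for blocks lacking a required key."""
    problems = []
    for number, address in enumerate(iter_address1(lines), start=1):
        missing = [k for k in REQUIRED_KEYS if k not in address]
        if missing:
            problems.append((number, missing))
    return problems
```

Running `find_incomplete(open('janedoe.txt'))` and getting an empty list back suggests every address1 block has the three fields the INSERT statement needs; any entries it does return point at blocks to inspect by hand.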