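I'm trying to parse some SQL scripts with Python to extract the names of the tables and columns affected by a given query. The scripts are not formatted consistently. Some of them look like this: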
Select col1, col2
FROM Table1
and others look like this:
Select
col1, col2
FROM
Table1
My code looks like this:
tables1 = []
for line in file1:
    line2 = line.split()
    for i in range(len(line2)):
        if line2[i] == 'FROM':
            # Raises IndexError when 'FROM' is the last word on the line
            tables1.append(line2[i + 1])
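Right now I'm parsing line by line, but I need some kind of iterator to move on to the next line when 'FROM' is the last string on a line. Any suggestions on how to do this?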
Answer 0 (score: 1)
This is easy to solve with the re module:
import re

def parse_query(query):
    # Raw string so that \s reaches the regex engine unchanged.
    # The table group stops at whitespace or ';' so a trailing semicolon is not captured.
    res = re.match(r'select\s+(?P<cols>.+)\s+from\s+(?P<tbl>[^\s;]+)\s*;?', query, re.I | re.M)
    if res is not None:
        cols = [c.strip() for c in res.group('cols').split(',')]
        return {'columns': cols, 'table': res.group('tbl')}
    return None  # no SELECT ... FROM found

print(parse_query('''Select col1, col2
FROM Table1'''))
print(parse_query('''Select
col1, col2
FROM
Table1'''))
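The function above handles one statement per call and expects the column list on a single line. A minimal sketch of how the same idea could be applied to a whole script follows. The names `SELECT_RE` and `parse_script` are hypothetical and not part of the answer, and the pattern assumes simple statements with no subqueries, joins, or comments.

```python
import re

# Hypothetical helper (not from the answer above): one match per
# SELECT ... FROM pair found anywhere in the script.
SELECT_RE = re.compile(
    r'select\s+(?P<cols>.+?)\s+from\s+(?P<tbl>[^\s;]+)',
    re.IGNORECASE | re.DOTALL,  # DOTALL lets the column list span several lines
)

def parse_script(script):
    """Return a list of {'columns': [...], 'table': ...} dicts, one per statement."""
    results = []
    for match in SELECT_RE.finditer(script):
        cols = [c.strip() for c in match.group('cols').split(',')]
        results.append({'columns': cols, 'table': match.group('tbl')})
    return results

script = '''Select col1, col2
FROM Table1;
Select
col1,
col2
FROM
Table1
;
SELECT col3 FROM Table3;'''

for entry in parse_script(script):
    print(entry)
```

The lazy `.+?` keeps each match from running past the first `FROM` that follows it, which is what allows several statements in one string. Anything beyond this simple shape is probably better left to a real SQL parser.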
Answer 1 (score: 1)
Use the sqlparse module.
Go to https://pypi.python.org/pypi/sqlparse, download the tar.gz, and install it:
> python setup.py install
Then create a test file, sqlExamples.sql:
Select col1, col2
FROM Table1;
Select
col1, col2
FROM
Table1
;
SELECT col3 FROM Table3;
Now let's see whether the parser can help us. This is not a particularly efficient script; it was written for learning purposes:
import sqlparse

print("--------------------------------------------------------------------------")
print("loading the file into a string")
print("--------------------------------------------------------------------------")
with open("sqlExamples.sql", "r") as myfile:
    sql = myfile.read()
print(sql)
print("--------------------------------------------------------------------------")
print("Example 1: using the parser to reformat SQL to a standardized format")
print("--------------------------------------------------------------------------")
formattedSQL = sqlparse.format(sql, reindent=True, keyword_case='upper')
print(formattedSQL)
print("--------------------------------------------------------------------------")
print("Example 1.A: reformatting statements, to single lines, for string analysis")
print("--------------------------------------------------------------------------")
words = " ".join(formattedSQL.split()).replace('; ', ';\n')
print(words)
print("--------------------------------------------------------------------------")
print("Example 2: using the parser more directly, to extract columns")
print("--------------------------------------------------------------------------")
parsed = sqlparse.parse(sql)
columns = []
tables = []
for statement in parsed:
    # These test cases rely on get_name() returning the affected table;
    # that is not guaranteed for arbitrary statements.
    if statement.get_name() not in tables:
        tables.append(statement.get_name())
    for token in statement.tokens:
        # is_whitespace and is_group are attributes in sqlparse 0.2+
        # (older releases exposed them as methods).
        if token.is_whitespace:
            continue
        if "SELECT" in statement.get_type() and token.is_group:
            # The first grouped token after SELECT holds the column list.
            for col in token.value.split(","):
                if col.strip() not in columns:
                    columns.append(col.strip())
            break
print("tables:" + str(tables))
print("cols:" + str(columns))