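Python: extracting information from the current line or the next line

Time: 2015-07-08 13:40:06

Tags: python

I'm trying to use Python to parse some SQL scripts and extract the names of the tables and columns affected by a given query. The scripts are not formatted consistently. Some of them use this format: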

Select col1, col2
FROM Table1

and others use this one:

 Select 
       col1, col2
    FROM 
       Table1

My code looks like this:

tables1 = []
for line in file1:
   line2 = line.split()
   for i in range(len(line2)):
      if line2[i] == 'FROM':
         # Fails with IndexError when 'FROM' is the last token on the line
         tables1.append(line2[i + 1])

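At the moment I'm parsing line by line, but I would need some kind of iterator to move on to the next line when 'FROM' is the last string on the line. Any suggestions on how to do this?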
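One way to read the requirement above is to stop treating lines as the unit and walk a single stream of tokens instead. The following is only a sketch under assumptions: the function name `tables_after_from` is hypothetical, the keyword match is made case-insensitive, and a trailing `;` is stripped from the table name, none of which appears in the original code.

```python
def tables_after_from(lines):
    """Collect the token that follows each FROM, even across line breaks."""
    # Flatten all lines into one token stream so line boundaries stop mattering.
    tokens = (tok for line in lines for tok in line.split())
    tables = []
    for tok in tokens:
        if tok.upper() == 'FROM':  # assumption: keyword case may vary
            # next() pulls from the same generator, so it continues onto the
            # following line; it yields None if FROM is the very last token.
            name = next(tokens, None)
            if name is not None:
                tables.append(name.rstrip(';'))  # assumption: drop a trailing ';'
    return tables


script = """Select col1, col2
FROM Table1

 Select
       col1, col2
    FROM
       Table1
""".splitlines()

print(tables_after_from(script))
```

Because `for` and `next()` share one generator, no index arithmetic is needed and a dangling `FROM` at the end of the input is skipped instead of raising `IndexError`. An open file object can be passed in place of the list of lines.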

2 answers:

Answer 0: (score: 1)

This is easy to solve with the re module:

import re

def parse_query(query):
    # \s+ also matches newlines, so SELECT, the columns, FROM and the table may sit on separate lines.
    # re.S lets the column list itself span lines; [^\s;]+ keeps a trailing ';' out of the table name.
    res = re.match(r'\s*select\s+(?P<cols>.+?)\s+from\s+(?P<tbl>[^\s;]+)', query, re.I | re.S)
    if res is not None:
        cols = [c.strip() for c in res.group('cols').split(',')]
        return {'columns': cols, 'table': res.group('tbl')}

print(parse_query('''Select col1, col2
                     FROM Table1'''))

print(parse_query('''Select 
                         col1, col2
                     FROM 
                          Table1'''))
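The function above handles one statement per call. To scan a whole script that contains several statements, the same idea can be applied with `re.finditer`. This is a rough sketch, not a full SQL parser: the names `SELECT_RE` and `parse_script` are made up for illustration, and the pattern only covers plain `SELECT cols FROM table` statements. It does not handle joins, subqueries, comments, or string literals that contain the word "from".

```python
import re

# Hypothetical pattern for illustration: plain "SELECT cols FROM table" only.
SELECT_RE = re.compile(
    r'\bselect\s+(?P<cols>.+?)\s+from\s+(?P<tbl>[^\s;]+)',
    re.I | re.S,  # re.S lets the column list itself span several lines
)


def parse_script(script):
    """Return one {'columns': [...], 'table': ...} dict per SELECT found."""
    results = []
    for m in SELECT_RE.finditer(script):
        cols = [c.strip() for c in m.group('cols').split(',')]
        results.append({'columns': cols, 'table': m.group('tbl')})
    return results


script = '''Select col1, col2
FROM Table1;

 Select
       col1,
       col2
    FROM
        Table1
       ;
SELECT col3 FROM Table3;'''

for entry in parse_script(script):
    print(entry)
```

The lazy `.+?` stops at the first whitespace-delimited `from`, so each match stays inside its own statement.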

Answer 1: (score: 1)

Use the sqlparse module.

Go to https://pypi.python.org/pypi/sqlparse, download the tar.gz, and install it:

> python setup.py install

Then create the test file sqlExamples.sql:

Select col1, col2
FROM Table1;

 Select 
       col1, col2
    FROM 
        Table1
       ;
SELECT col3 FROM Table3;

Now let's see whether the parser can help us. This is not a very efficient script; it was written for learning purposes:

import sqlparse

print("--------------------------------------------------------------------------")
print("loading the file into a string")
print("--------------------------------------------------------------------------")
with open("sqlExamples.sql", "r") as myfile:
    sql = myfile.read()
print(sql)

print("--------------------------------------------------------------------------")
print("Example 1: using the parser to reformat SQL to a standardized format")
print("--------------------------------------------------------------------------")
formattedSQL = sqlparse.format(sql, reindent=True, keyword_case='upper')
print(formattedSQL)

print("--------------------------------------------------------------------------")
print("Example 1.A: reformatting statements, to single lines, for string analysis")
print("--------------------------------------------------------------------------")
# Collapse all whitespace, then start a new line after each ';'
words = " ".join(formattedSQL.split()).replace('; ', ';\n')
print(words)

print("--------------------------------------------------------------------------")
print("Example 2: using the parser more directly, to extract columns")
print("--------------------------------------------------------------------------")
parsed = sqlparse.parse(sql)
columns = []
tables = []
for statement in parsed:
    # For my test cases, the name of the statement and the affected table are the same thing.
    if statement.get_name() not in tables:
        tables.append(statement.get_name())

    # Walk the top-level tokens; the first group after SELECT holds the column list.
    for token in statement.tokens:
        # In current sqlparse releases is_whitespace and is_group are attributes
        # (older releases exposed them as methods).
        if token.is_whitespace:
            continue
        if "SELECT" in statement.get_type() and token.is_group:
            cols = token.value.split(",")
            for col in cols:
                if col.strip() not in columns:
                    columns.append(col.strip())
            break

print("tables:" + str(tables))
print("cols:" + str(columns))