Parsing SQL with Python to extract column and table names

Time: 2018-04-19 08:26:40

Tags: python sql

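I want to write code that extracts table names and column names from a query that does not use the JOIN keyword. Instead, it uses a Cartesian (comma) join, as shown below: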

SELECT suppliers.supplier_name, subquery1.total_amt
FROM suppliers
,
(SELECT supplier_id, SUM(orders.amount) AS total_amt
FROM orders
GROUP BY supplier_id) subquery1
WHERE subquery1.supplier_id = suppliers.supplier_id;

I tried the code below, but it does not work in Python 2.7: I get the error "'bool' object is not callable" at line 21:

    import itertools
    import sqlparse

    from sqlparse.sql import IdentifierList, Identifier
    from sqlparse.tokens import Keyword, DML


    def is_subselect(parsed):
        if not parsed.is_group():
            return False
        for item in parsed.tokens:
            if item.ttype is DML and item.value.upper() == 'SELECT':
                return True
        return False


    def extract_from_part(parsed):
        from_seen = False
        print 'hi'
        for item in parsed.tokens:
            if item.is_group():
                print 'group'
                for x in extract_from_part(item):
                    yield x
            if from_seen:
                print 'from'
                if is_subselect(item):
                    for x in extract_from_part(item):
                        yield x
                elif item.ttype is Keyword and item.value.upper() in ['ORDER', 'GROUP', 'BY', 'HAVING']:
                    from_seen = False
                    StopIteration
                else:
                    yield item
            if item.ttype is Keyword and item.value.upper() == 'FROM':
                from_seen = True


    def extract_table_identifiers(token_stream):
        for item in token_stream:
            if isinstance(item, IdentifierList):
                for identifier in item.get_identifiers():
                    value = identifier.value.replace('"', '').lower()
                    yield value
            elif isinstance(item, Identifier):
                value = item.value.replace('"', '').lower()
                yield value


    def extract_tables(sql):
        # let's handle multiple statements in one sql string
        extracted_tables = []
        statements = (sqlparse.parse(sql))

        for statement in statements:
            # print statement.get_type()
            if statement.get_type() != 'UNKNOWN':
                stream = extract_from_part(statement)
                print stream
                extracted_tables.append(set(list(extract_table_identifiers(stream))))
        return list(itertools.chain(*extracted_tables))


    # strsql = """
    # SELECT p.product_name, inventory.quantity
    # FROM products p join inventory
    # ON p.product_id = inventory.product_id;
    # """

    strsql = """SELECT suppliers.supplier_name, subquery1.total_amt
    FROM suppliers
    ,
     (SELECT supplier_id, SUM(orders.amount) AS total_amt
      FROM orders
      GROUP BY supplier_id) subquery1
    WHERE subquery1.supplier_id = suppliers.supplier_id;"""
    extract_tables(strsql)

Error: here is the traceback:

Traceback (most recent call last):
  File "4.py", line 77, in <module>
    extract_tables(strsql)
  File "4.py", line 60, in extract_tables
    extracted_tables.append(set(list(extract_table_identifiers(stream))))
  File "4.py", line 40, in extract_table_identifiers
    for item in token_stream:
  File "4.py", line 21, in extract_from_part
    if item.is_group():
TypeError: 'bool' object is not callable

1 answer:

Answer 0 (score: 1)

Thanks to @Gphilo for the answer:

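Judging from the traceback, is_group is not actually a function but a plain bool attribute. Try replacing item.is_group() with item.is_group and see whether that helps. Note that the question's code makes the same call in two places: parsed.is_group() in is_subselect and item.is_group() in extract_from_part.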
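
To illustrate the point, here is a minimal sketch that reproduces the error and shows one way to write the check so it works whether is_group is a method or a bool attribute. The two token classes are hypothetical stand-ins written for this example, not sqlparse classes, and the helper names token_is_group and reproduces_type_error are also made up here. Which form is_group takes presumably depends on the installed sqlparse version; that is an assumption, not something confirmed in this thread.

```python
class MethodStyleToken(object):
    """Hypothetical stand-in: is_group is a method, as the question's code expects."""

    def is_group(self):
        return True


class AttributeStyleToken(object):
    """Hypothetical stand-in: is_group is a plain bool, as the traceback suggests."""

    is_group = True


def token_is_group(token):
    """Return whether token is a group, for either style of is_group."""
    flag = getattr(token, "is_group", False)
    # Call it only when it is callable; otherwise treat it as a plain value.
    return bool(flag() if callable(flag) else flag)


def reproduces_type_error(token):
    """Return True if token.is_group() raises TypeError, as in the traceback."""
    try:
        token.is_group()
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(reproduces_type_error(AttributeStyleToken()))
    print(token_is_group(AttributeStyleToken()))
    print(token_is_group(MethodStyleToken()))
```

If only one sqlparse version needs to be supported, the simpler change from the answer (dropping the parentheses) is enough; a helper like this is only worth having when the same script must run against both styles.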