I'm working on a SQLAlchemy dialect for Apache Drill, and I've run into a problem I can't seem to figure out.
The basic issue is that SQLAlchemy is generating queries like this:
SELECT `field1`, `field2`
FROM dfs.test.data.csv LIMIT 100
This fails because data.csv needs to be wrapped in backticks, like this:
SELECT `field1`, `field2`
FROM dfs.test.`data.csv` LIMIT 100
I have defined various visit_() functions in the dialect's compiler, but they seem to have no effect.
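To illustrate why the unquoted name fails: dots separate the parts of a table path, so without backticks the file name is read as two parts rather than one. The snippet below is only a toy illustration of that splitting rule, not Drill's actual parser; the function name split_identifier_path is hypothetical.

```python
def split_identifier_path(path):
    """Split a dotted identifier path on dots that fall outside backticks.

    Toy illustration only; this is not how Drill itself is implemented.
    """
    parts, current, quoted = [], [], False
    for ch in path:
        if ch == "`":
            # Toggle the quoted state and drop the backtick itself
            quoted = not quoted
        elif ch == "." and not quoted:
            # An unquoted dot ends the current path part
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


print(split_identifier_path("dfs.test.data.csv"))    # ['dfs', 'test', 'data', 'csv']
print(split_identifier_path("dfs.test.`data.csv`"))  # ['dfs', 'test', 'data.csv']
```

With the backticks, the path has three parts (plugin, workspace, file), which is what the query needs.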
Answer 0 (score: 1)
This took a while to figure out, so I'm posting the result as a reference point for anyone else who runs into this problem.
Here is the final working code:
https://github.com/JohnOmernik/sqlalchemy-drill/blob/master/sqlalchemy_drill/base.py
Here is what finally solved the problem:
from sqlalchemy.sql import compiler


# Base class assumed from the super() call below; see the linked base.py for the full class.
class DrillIdentifierPreparer(compiler.IdentifierPreparer):
    def __init__(self, dialect):
        # Quote identifiers with backticks
        super(DrillIdentifierPreparer, self).__init__(dialect, initial_quote='`', final_quote='`')

    def format_drill_table(self, schema, isFile=True):
        formatted_schema = ""
        num_dots = schema.count(".")
        # Strip any existing backticks so the parts are not quoted twice
        schema = schema.replace('`', '')
        # For a file, the last section will be the file extension
        schema_parts = schema.split('.')
        if isFile and num_dots == 3:
            # Case for File + Workspace: plugin.workspace.name.ext
            plugin = schema_parts[0]
            workspace = schema_parts[1]
            table = schema_parts[2] + "." + schema_parts[3]
            formatted_schema = plugin + ".`" + workspace + "`.`" + table + "`"
        elif isFile and num_dots == 2:
            # Case for file and no workspace
            # (as written, only the last part is wrapped in backticks)
            plugin = schema_parts[0]
            formatted_schema = plugin + "." + schema_parts[1] + ".`" + schema_parts[2] + "`"
        else:
            # Case for non-file plugins or incomplete schema parts:
            # quote every part individually
            for part in schema_parts:
                quoted_part = "`" + part + "`"
                if len(formatted_schema) > 0:
                    formatted_schema += "." + quoted_part
                else:
                    formatted_schema = quoted_part
        return formatted_schema
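
To try the quoting logic without installing SQLAlchemy, here is a minimal standalone sketch of the same rules as a plain function. The name format_drill_table and its behavior follow the method above; the is_file parameter spelling and the sample inputs are my own assumptions for illustration.

```python
def format_drill_table(schema, is_file=True):
    """Standalone sketch of the quoting rules used by the method above."""
    # Strip existing backticks so already-quoted input is handled the same way
    parts = schema.replace("`", "").split(".")
    if is_file and len(parts) == 4:
        # plugin.workspace.name.ext: quote the workspace and the whole file name
        plugin, workspace, name, ext = parts
        return f"{plugin}.`{workspace}`.`{name}.{ext}`"
    if is_file and len(parts) == 3:
        # Three parts: only the last part is quoted, as in the method above
        return f"{parts[0]}.{parts[1]}.`{parts[2]}`"
    # Non-file plugins or any other shape: quote every part
    return ".".join(f"`{part}`" for part in parts)


print(format_drill_table("dfs.test.data.csv"))            # dfs.`test`.`data.csv`
print(format_drill_table("dfs.test.`data.csv`"))          # dfs.`test`.`data.csv`
print(format_drill_table("hive.orders", is_file=False))   # `hive`.`orders`
```

Note that the workspace is also wrapped in backticks, so the output is slightly more quoted than the query in the question, but the file name data.csv is kept together as one identifier.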