Sqoop - Importing the max value fails when the query uses order by and limit 1

Time: 2016-06-05 19:56:33

Tags: mysql hadoop hive hdfs sqoop

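I have a simple Sqoop query that I use to import the maximum value of a table's id and store it in HDFS. Storing it in HDFS is a client requirement, so I am going with it for several reasons.

To get the max value I use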

sqoop import \
--connect jdbc:mysql://abc.com/sqoopemp \
--username root \
--password root \
--e 'select max(id) from emp WHERE $CONDITIONS' \
--target-dir sqooplastmax \
--m 1 \
--driver com.mysql.jdbc.Driver

The above query gives me the required answer, but for performance reasons I am thinking of using the following

sqoop import \
--connect jdbc:mysql://abc.com/sqoopemp \
--username root \
--password root \
--query 'select id from emp oder by id limit 1 WHERE $CONDITIONS' \
--target-dir sqooplastmax1 \
--m 1 \
--driver com.mysql.jdbc.Driver

This query fails with the following error

    Warning: /usr/hdp/2.4.0.0-169/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/06/05 15:50:06 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.0.0-169
16/06/05 15:50:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/06/05 15:50:06 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
16/06/05 15:50:06 INFO manager.SqlManager: Using default fetchSize of 1000
16/06/05 15:50:06 INFO tool.CodeGenTool: Beginning code generation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/06/05 15:50:06 INFO manager.SqlManager: Executing SQL statement: select id from emp order by id desc limit 1 WHERE  (1 = 0)
16/06/05 15:50:06 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE  (1 = 0)' at line 1
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE  (1 = 0)' at line 1
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.Util.getInstance(Util.java:386)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
    at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2283)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:758)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:767)
    at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:270)
    at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:241)
    at org.apache.sqoop.manager.SqlManager.getColumnTypesForQuery(SqlManager.java:234)
    at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:304)
    at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1845)
    at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
    at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
16/06/05 15:50:06 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: No columns to generate for ClassWriter
    at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1651)
    at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:244)

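The problem is clearly related to WHERE $CONDITIONS, but I don't know what I am missing. The first free-form query works, but when I use it with order by and limit it does not. Any help with this would be appreciated.

1 Answer:

Answer 0: (score: 3)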

The clauses in your query seem to be in the wrong order (plus it has a typo, oder instead of order):

select id from emp oder by id limit 1 WHERE $CONDITIONS

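It should read as follows, with the WHERE clause before order by and limit, and with desc (as in the statement Sqoop logged) so that the single row returned is the maximum id rather than the minimum:

select id from emp WHERE $CONDITIONS order by id desc limit 1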
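
The ordering rule can be checked without a cluster. The following is only a sketch under stated assumptions: an in-memory SQLite database stands in for MySQL, the three-row emp table is made up, and the run helper is a hypothetical imitation of the substitution visible in the log above, where $CONDITIONS had become (1 = 0) by the time the statement reached MySQL. It uses order by id desc so that the one returned row is the maximum id.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the emp rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO emp (id) VALUES (?)", [(3,), (7,), (5,)])


def run(template, conditions="(1 = 1)"):
    """Hypothetical helper: replace the $CONDITIONS token, then execute."""
    sql = template.replace("$CONDITIONS", conditions)
    return conn.execute(sql).fetchall()


wrong = "select id from emp order by id desc limit 1 WHERE $CONDITIONS"
right = "select id from emp WHERE $CONDITIONS order by id desc limit 1"

try:
    run(wrong)
    wrong_failed = False
except sqlite3.OperationalError:
    # A WHERE clause placed after LIMIT is rejected by the SQL parser.
    wrong_failed = True

# With a condition that is always true, the first row is the maximum id.
max_id = run(right)[0][0]

# With (1 = 0), as in the logged statement, the query is valid but returns no rows.
probe_rows = run(right, "(1 = 0)")
```

The point of the sketch is only that the placeholder has to sit where a WHERE predicate is legal; whatever gets substituted for it, a statement with WHERE after limit cannot parse.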

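This also looks unsafe if $CONDITIONS is set externally: anyone could insert arbitrary code into $CONDITIONS through so-called SQL injection.

The best way to deal with SQL injection is to split $CONDITIONS into two parts:

1) the column name
2) the value(s)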

If Sqoop does not allow parameterized queries like this:

select id from emp  WHERE some_column=:columnValue order by id limit 1

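then you can go in one of two directions:

A) add validation code before the sqoop call

or

B) create a stored procedure in MySQL that checks the validity of the query before executing it.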
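
Direction A could look roughly like the sketch below, run before the sqoop call to build the string passed to --query. Everything in it is an assumption made for illustration: the ALLOWED_COLUMNS allowlist, the build_query helper, and the premise that the filter value is always an integer. The extra filter is combined with $CONDITIONS using AND, so the placeholder stays in the WHERE clause untouched.

```python
import re

# Hypothetical allowlist of emp columns that a caller may filter on.
ALLOWED_COLUMNS = {"id", "some_column"}

# Assumption: filter values are plain integers, optionally negative.
_INTEGER = re.compile(r"-?[0-9]+")


def build_query(column, value):
    """Return a free-form query string, or raise ValueError on bad input."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"column not allowed: {column!r}")
    text = str(value)
    if not _INTEGER.fullmatch(text):
        raise ValueError(f"value is not an integer: {text!r}")
    # $CONDITIONS is left as a literal token; it is not filled in here.
    return (
        f"select id from emp WHERE {column} = {int(text)} "
        "AND $CONDITIONS order by id desc limit 1"
    )
```

For example, build_query("some_column", "42") yields a query filtering on some_column = 42, while a value such as "1 OR 1=1" or a column name outside the allowlist raises ValueError before any command is assembled.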