Sqoop Hive import

Asked: 2016-04-15 13:18:19

Tags: hadoop hive hdfs sqoop

I am facing an issue when importing data from an RDBMS into Hive with Sqoop. Please do not mark this as a duplicate.

I am importing data from an Oracle 12c database into Hive. The data lands in HDFS — a folder named after the imported table is created under /user/hive/warehouse — but no Hive table is created: when I go to the Hive editor and look for this table, it is not shown.
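To see the symptom concretely, one can compare HDFS with the metastore. A minimal diagnostic sketch — the paths assume the default warehouse location and database, so adjust for your cluster:

```shell
# The import created the directory in the warehouse path...
hdfs dfs -ls /user/hive/warehouse | grep -i emp

# ...but the metastore has no matching table, so this prints nothing:
hive -e "SHOW TABLES LIKE 'emp'"
```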

However, if I create the table's schema before importing the data (CREATE TABLE ...) and then run the same sqoop import, the data shows up both in HDFS and in the Hive table.

Here are the details of what I am doing.

1. Initialization

    usermod -a -G supergroup venkat
    export SQOOP_HOME=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop
    export HIVE_HOME=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hive
    export HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/ojdbc7.jar:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/hive/lib/*
    export HADOOP_USER_NAME=hdfs
    export PATH=$PATH:$HIVE_HOME/bin

2. Run the Sqoop import

    sqoop import --connect jdbc:oracle:thin:@bigdatadev2:1521/orcl --username BDD1 --password oracle123 --table EMP -m 1 --hive-import --hive-table emp

Here is the log:

    16/04/15 18:10:10 DEBUG orm.CompilationManager: Could not rename /tmp/sqoop-root/compile/c6516678b105260186a4b5daf7552fcb/EMP.java to /root/./EMP.java
    16/04/15 18:10:10 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/c6516678b105260186a4b5daf7552fcb/EMP.jar
    16/04/15 18:10:10 DEBUG orm.CompilationManager: Scanning for .class files in directory: /tmp/sqoop-root/compile/c6516678b105260186a4b5daf7552fcb
    16/04/15 18:10:10 DEBUG orm.CompilationManager: Got classfile: /tmp/sqoop-root/compile/c6516678b105260186a4b5daf7552fcb/EMP.class -> EMP.class
    16/04/15 18:10:10 DEBUG orm.CompilationManager: Finished writing jar file /tmp/sqoop-root/compile/c6516678b105260186a4b5daf7552fcb/EMP.jar
    16/04/15 18:10:10 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:10:10 INFO manager.OracleManager: Time zone has been set to GMT
    16/04/15 18:10:10 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:10:10 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:10:10 INFO manager.OracleManager: Time zone has been set to GMT
    16/04/15 18:10:10 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:10:10 INFO mapreduce.ImportJobBase: Beginning import of EMP
    16/04/15 18:10:10 DEBUG util.ClassLoaderStack: Checking for existing class: EMP
    16/04/15 18:10:10 DEBUG util.ClassLoaderStack: Attempting to load jar through URL: jar:file:/tmp/sqoop-root/compile/c6516678b105260186a4b5daf7552fcb/EMP.jar!/
    16/04/15 18:10:10 DEBUG util.ClassLoaderStack: Previous classloader is sun.misc.Launcher$AppClassLoader@5a42bbf4
    16/04/15 18:10:10 DEBUG util.ClassLoaderStack: Testing class in jar: EMP
    16/04/15 18:10:10 DEBUG util.ClassLoaderStack: Loaded jar into current JVM: jar:file:/tmp/sqoop-root/compile/c6516678b105260186a4b5daf7552fcb/EMP.jar!/
    16/04/15 18:10:10 DEBUG util.ClassLoaderStack: Added classloader for jar /tmp/sqoop-root/compile/c6516678b105260186a4b5daf7552fcb/EMP.jar: java.net.FactoryURLClassLoader@37d3d232
    16/04/15 18:10:10 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
    16/04/15 18:10:10 DEBUG db.DBConfiguration: Securing password into job credentials store
    16/04/15 18:10:10 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:10:10 INFO manager.OracleManager: Time zone has been set to GMT
    16/04/15 18:10:10 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:10:10 DEBUG mapreduce.DataDrivenImportJob: Using table class: EMP
    16/04/15 18:10:10 DEBUG mapreduce.DataDrivenImportJob: Using InputFormat: class com.cloudera.sqoop.mapreduce.db.OracleDataDrivenDBInputFormat
    16/04/15 18:10:11 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/sqoop-1.4.6-cdh5.5.1.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/ojdbc6.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/sqoop-1.4.6-cdh5.5.1.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/sqoop-1.4.6-cdh5.5.1.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/parquet-jackson.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/avro-mapred-hadoop2.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/parquet-avro.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/xz-1.0.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/jackson-annotations-2.3.1.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/commons-compress-1.4.1.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/fastutil-6.3.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/ojdbc6.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/jackson-databind-2.3.1.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/avro.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/kite-data-mapreduce.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/kite-data-hive.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/kite-hadoop-compatibility.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/jackson-core-asl-1.8.8.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/commons-io-1.4.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/parquet-encoding.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/kite-data-core.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/hsqldb-1.8.0.10.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/parquet-column.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/parquet-common.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/ojdbc7.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/commons-jexl-2.1.1.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/paranamer-2.3.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/commons-logging-1.1.3.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/parquet-hadoop.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/parquet-format.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/slf4j-api-1.7.5.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/opencsv-2.3.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/jackson-mapper-asl-1.8.8.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/snappy-java-1.0.4.1.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/commons-codec-1.4.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/jackson-core-2.3.1.jar
    16/04/15 18:10:11 DEBUG mapreduce.JobBase: Adding to job classpath: file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/sqoop/lib/ant-contrib-1.0b3.jar
    16/04/15 18:10:11 INFO client.RMProxy: Connecting to ResourceManager at bigdata/10.103.25.39:8032
    16/04/15 18:10:19 DEBUG db.DBConfiguration: Fetching password from job credentials store
    16/04/15 18:10:19 INFO db.DBInputFormat: Using read commited transaction isolation
    16/04/15 18:10:19 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound '1=1' and upper bound '1=1'
    16/04/15 18:10:19 INFO mapreduce.JobSubmitter: number of splits:1
    16/04/15 18:10:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460530815947_0198
    16/04/15 18:10:21 INFO impl.YarnClientImpl: Submitted application application_1460530815947_0198
    16/04/15 18:10:21 INFO mapreduce.Job: The url to track the job: http://bigdata:8088/proxy/application_1460530815947_0198/
    16/04/15 18:10:21 INFO mapreduce.Job: Running job: job_1460530815947_0198
    16/04/15 18:10:30 INFO mapreduce.Job: Job job_1460530815947_0198 running in uber mode : false
    16/04/15 18:10:30 INFO mapreduce.Job:  map 0% reduce 0%
    16/04/15 18:10:38 INFO mapreduce.Job:  map 100% reduce 0%
    16/04/15 18:10:39 INFO mapreduce.Job: Job job_1460530815947_0198 completed successfully
    16/04/15 18:10:39 INFO mapreduce.Job: Counters: 30
        File System Counters
            FILE: Number of bytes read=0
            FILE: Number of bytes written=137941
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=87
            HDFS: Number of bytes written=12
            HDFS: Number of read operations=4
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
        Job Counters 
            Launched map tasks=1
            Other local map tasks=1
            Total time spent by all maps in occupied slots (ms)=4157
            Total time spent by all reduces in occupied slots (ms)=0
            Total time spent by all map tasks (ms)=4157
            Total vcore-seconds taken by all map tasks=4157
            Total megabyte-seconds taken by all map tasks=2128384
        Map-Reduce Framework
            Map input records=3
            Map output records=3
            Input split bytes=87
            Spilled Records=0
            Failed Shuffles=0
            Merged Map outputs=0
            GC time elapsed (ms)=56
            CPU time spent (ms)=2240
            Physical memory (bytes) snapshot=213250048
            Virtual memory (bytes) snapshot=2180124672
            Total committed heap usage (bytes)=134742016
        File Input Format Counters 
            Bytes Read=0
        File Output Format Counters 
            Bytes Written=12
    16/04/15 18:10:39 INFO mapreduce.ImportJobBase: Transferred 12 bytes in 28.3024 seconds (0.424 bytes/sec)
    16/04/15 18:10:39 INFO mapreduce.ImportJobBase: Retrieved 3 records.
    16/04/15 18:10:39 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@5a42bbf4
    16/04/15 18:10:39 DEBUG hive.HiveImport: Hive.inputTable: EMP
    16/04/15 18:10:39 DEBUG hive.HiveImport: Hive.outputTable: emp
    16/04/15 18:10:39 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM EMP t WHERE 1=0
    16/04/15 18:10:39 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM EMP t WHERE 1=0
    16/04/15 18:10:39 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:10:39 INFO manager.OracleManager: Time zone has been set to GMT
    16/04/15 18:10:39 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
    16/04/15 18:10:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM EMP t WHERE 1=0
    16/04/15 18:10:39 DEBUG manager.SqlManager: Found column EMP_ID of type [12, 5, 0]
    16/04/15 18:10:39 DEBUG manager.SqlManager: Found column EMP_NAME of type [12, 100, 0]
    16/04/15 18:10:39 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:10:39 DEBUG hive.TableDefWriter: Create statement: CREATE TABLE IF NOT EXISTS `emp` ( `EMP_ID` STRING, `EMP_NAME` STRING) COMMENT 'Imported by sqoop on 2016/04/15 18:10:39' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE
    16/04/15 18:10:39 DEBUG hive.TableDefWriter: Load statement: LOAD DATA INPATH 'hdfs://bigdata:8020/user/hdfs/EMP' INTO TABLE `emp`
    16/04/15 18:10:39 INFO hive.HiveImport: Loading uploaded data into Hive
    16/04/15 18:10:39 DEBUG hive.HiveImport: Using in-process Hive instance.
    16/04/15 18:10:39 DEBUG util.SubprocessSecurityManager: Installing subprocess security manager

    Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/hive-common-1.1.0-cdh5.5.1.jar!/hive-log4j.properties
    OK
    Time taken: 0.782 seconds
    Loading data to table default.emp
    Table default.emp stats: [numFiles=1, numRows=0, totalSize=12, rawDataSize=0]
    OK
    Time taken: 0.676 seconds
    You have new mail in /var/spool/mail/root
    [root@bigdatadev1 ~]# 

3. When I run the sqoop import command below, I get the following error (last few lines of the log shown):

    sqoop import --connect jdbc:oracle:thin:@bigdatadev2:1521/orcl --username BDD1 --password oracle123 --table EMP -m 1 --create-hive-table --hive-import --hive-table emp

    16/04/15 18:17:02 INFO mapreduce.Job: The url to track the job: http://bigdata:8088/proxy/application_1460530815947_0200/
    16/04/15 18:17:02 INFO mapreduce.Job: Running job: job_1460530815947_0200
    16/04/15 18:17:11 INFO mapreduce.Job: Job job_1460530815947_0200 running in uber mode : false
    16/04/15 18:17:11 INFO mapreduce.Job:  map 0% reduce 0%
    16/04/15 18:17:17 INFO mapreduce.Job:  map 100% reduce 0%
    16/04/15 18:17:19 INFO mapreduce.Job: Job job_1460530815947_0200 completed successfully
    16/04/15 18:17:19 INFO mapreduce.Job: Counters: 30
        File System Counters
            FILE: Number of bytes read=0
            FILE: Number of bytes written=137940
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=87
            HDFS: Number of bytes written=12
            HDFS: Number of read operations=4
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
        Job Counters 
            Launched map tasks=1
            Other local map tasks=1
            Total time spent by all maps in occupied slots (ms)=4067
            Total time spent by all reduces in occupied slots (ms)=0
            Total time spent by all map tasks (ms)=4067
            Total vcore-seconds taken by all map tasks=4067
            Total megabyte-seconds taken by all map tasks=2082304
        Map-Reduce Framework
            Map input records=3
            Map output records=3
            Input split bytes=87
            Spilled Records=0
            Failed Shuffles=0
            Merged Map outputs=0
            GC time elapsed (ms)=58
            CPU time spent (ms)=2350
            Physical memory (bytes) snapshot=221048832
            Virtual memory (bytes) snapshot=2178715648
            Total committed heap usage (bytes)=135790592
        File Input Format Counters 
            Bytes Read=0
        File Output Format Counters 
            Bytes Written=12
    16/04/15 18:17:19 INFO mapreduce.ImportJobBase: Transferred 12 bytes in 21.9242 seconds (0.5473 bytes/sec)
    16/04/15 18:17:19 INFO mapreduce.ImportJobBase: Retrieved 3 records.
    16/04/15 18:17:19 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@5a42bbf4
    16/04/15 18:17:19 DEBUG hive.HiveImport: Hive.inputTable: EMP
    16/04/15 18:17:19 DEBUG hive.HiveImport: Hive.outputTable: emp
    16/04/15 18:17:19 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM EMP t WHERE 1=0
    16/04/15 18:17:19 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM EMP t WHERE 1=0
    16/04/15 18:17:19 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:17:19 INFO manager.OracleManager: Time zone has been set to GMT
    16/04/15 18:17:19 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
    16/04/15 18:17:19 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM EMP t WHERE 1=0
    16/04/15 18:17:19 DEBUG manager.SqlManager: Found column EMP_ID of type [12, 5, 0]
    16/04/15 18:17:19 DEBUG manager.SqlManager: Found column EMP_NAME of type [12, 100, 0]
    16/04/15 18:17:19 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@bigdatadev2:1521/orcl/BDD1
    16/04/15 18:17:19 DEBUG hive.TableDefWriter: Create statement: CREATE TABLE `emp` ( `EMP_ID` STRING, `EMP_NAME` STRING) COMMENT 'Imported by sqoop on 2016/04/15 18:17:19' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE
    16/04/15 18:17:19 DEBUG hive.TableDefWriter: Load statement: LOAD DATA INPATH 'hdfs://bigdata:8020/user/hdfs/EMP' INTO TABLE `emp`
    16/04/15 18:17:19 INFO hive.HiveImport: Loading uploaded data into Hive
    16/04/15 18:17:19 DEBUG hive.HiveImport: Using in-process Hive instance.
    16/04/15 18:17:19 DEBUG util.SubprocessSecurityManager: Installing subprocess security manager

    Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/hive-common-1.1.0-cdh5.5.1.jar!/hive-log4j.properties
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. AlreadyExistsException(message:Table emp already exists)
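The AlreadyExistsException is expected here: with --create-hive-table Sqoop emits a plain CREATE TABLE (no IF NOT EXISTS — compare the two generated DDL statements in the logs above), so the emp table left behind by the earlier run makes the DDL task fail. One way out is to drop the table before re-running; this sketch assumes the default database:

```shell
# Remove the leftover table so --create-hive-table can recreate it:
hive -e "DROP TABLE IF EXISTS emp"
```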

4. Now, if I first create the table's schema (CREATE TABLE emp ...) and then import the data, the data appears in both HDFS and the Hive table.

Can Sqoop only be used when the table schema already exists? Can it not create a new table on its own?

What am I missing in my import command? Please do not mark this as a duplicate.

1 Answer:

Answer 0 (score: 0)

Usually these errors come down to permission problems or misconfiguration. Given that this is not an incremental extraction, the following should work.

I have added the --warehouse-dir and --hive-overwrite options. Please drop the existing emp table in Hive, then run the command below, replacing path/to/dir with the appropriate warehouse directory in HDFS.

    sqoop import --connect jdbc:oracle:thin:@bigdatadev2:1521/orcl --username BDD1 --password oracle123 --table EMP -m 1 --warehouse-dir "path/to/dir" --hive-import --hive-overwrite --hive-table emp