从postgresql导出和导入Sqoop安装

时间:2014-06-11 13:08:49

标签: postgresql hadoop sqoop

我刚刚安装了sqoop并正在测试它。我尝试使用sqoop将一些数据从hdfs导出到postgresql。当我运行它时会抛出以下异常:java.io.IOException: Can't export data, please check task tracker logs。我认为安装可能也存在问题。

文件内容为:

ustNU 45
MB1bA 0
gNbCO 76
iZP10 39
B2aoo 45
SI7eG 93
5sC4k 60
2IhFV 2
u2A48 16
yvy6R 51
LNhsV 26
mZ2yn 65
80Gp3 43
Wk5Ag 85
VUfyp 93
P077j 94
f1Oj5 11
LxJkg 72
0H7NP 99
Dk406 25
g4KRp 76
Fw3U0 80
6LD59 1
07KHx 91
F1S88 72
Bnb0v 85
A2qM7 79
Z6cAt 81
0M3DO 23
m0s09 44
KIvwd 13
GNUD0 78
um93a 20
19bHv 75
4Of3s 75
5hFen 16

这是posgres表:

    Table "public.mysort"
 Column |  Type   | Modifiers 
--------+---------+-----------
 name   | text    | 
 marks  | integer | 

sqoop命令是:

sqoop export --connect jdbc:postgresql://localhost/testdb --username akshay --password akshay --table mysort -m 1 --export-dir MySort/input

接着是错误:

Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/06/11 18:28:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/06/11 18:28:06 INFO manager.SqlManager: Using default fetchSize of 1000
14/06/11 18:28:06 INFO tool.CodeGenTool: Beginning code generation
14/06/11 18:28:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "mysort" AS t LIMIT 1
14/06/11 18:28:06 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hduser/compile/0402ad4b5cf7980040264af35de406cb/mysort.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/06/11 18:28:07 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hduser/compile/0402ad4b5cf7980040264af35de406cb/mysort.jar
14/06/11 18:28:07 INFO mapreduce.ExportJobBase: Beginning export of mysort
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hbase/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/06/11 18:28:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/06/11 18:28:22 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/06/11 18:28:23 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/06/11 18:28:23 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
14/06/11 18:28:23 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/06/11 18:28:23 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/06/11 18:28:24 INFO input.FileInputFormat: Total input paths to process : 1
14/06/11 18:28:24 INFO input.FileInputFormat: Total input paths to process : 1
14/06/11 18:28:25 INFO mapreduce.JobSubmitter: number of splits:1
14/06/11 18:28:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402488523460_0003
14/06/11 18:28:25 INFO impl.YarnClientImpl: Submitted application application_1402488523460_0003
14/06/11 18:28:25 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1402488523460_0003/
14/06/11 18:28:25 INFO mapreduce.Job: Running job: job_1402488523460_0003
14/06/11 18:28:46 INFO mapreduce.Job: Job job_1402488523460_0003 running in uber mode : false
14/06/11 18:28:46 INFO mapreduce.Job:  map 0% reduce 0%
14/06/11 18:29:04 INFO mapreduce.Job: Task Id : attempt_1402488523460_0003_m_000000_0, Status : FAILED
Error: java.io.IOException: Can't export data, please check task tracker logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.util.NoSuchElementException
    at java.util.ArrayList$Itr.next(ArrayList.java:839)
    at mysort.__loadFromFields(mysort.java:198)
    at mysort.parse(mysort.java:147)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more

14/06/11 18:29:23 INFO mapreduce.Job: Task Id : attempt_1402488523460_0003_m_000000_1, Status : FAILED
Error: java.io.IOException: Can't export data, please check task tracker logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.util.NoSuchElementException
    at java.util.ArrayList$Itr.next(ArrayList.java:839)
    at mysort.__loadFromFields(mysort.java:198)
    at mysort.parse(mysort.java:147)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more

14/06/11 18:29:42 INFO mapreduce.Job: Task Id : attempt_1402488523460_0003_m_000000_2, Status : FAILED
Error: java.io.IOException: Can't export data, please check task tracker logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.util.NoSuchElementException
    at java.util.ArrayList$Itr.next(ArrayList.java:839)
    at mysort.__loadFromFields(mysort.java:198)
    at mysort.parse(mysort.java:147)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more

14/06/11 18:30:03 INFO mapreduce.Job:  map 100% reduce 0%
14/06/11 18:30:03 INFO mapreduce.Job: Job job_1402488523460_0003 failed with state FAILED due to: Task failed task_1402488523460_0003_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

14/06/11 18:30:03 INFO mapreduce.Job: Counters: 9
    Job Counters 
        Failed map tasks=4
        Launched map tasks=4
        Other local map tasks=3
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=69336
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=69336
        Total vcore-seconds taken by all map tasks=69336
        Total megabyte-seconds taken by all map tasks=71000064
14/06/11 18:30:03 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
14/06/11 18:30:03 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 100.1476 seconds (0 bytes/sec)
14/06/11 18:30:03 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
14/06/11 18:30:03 INFO mapreduce.ExportJobBase: Exported 0 records.
14/06/11 18:30:03 ERROR tool.ExportTool: Error during export: Export job failed!

这是日志文件:

2014-06-11 17:54:37,601 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2014-06-11 17:54:37,602 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2014-06-11 17:54:52,678 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-06-11 17:54:52,777 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-06-11 17:54:52,846 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-06-11 17:54:52,847 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2014-06-11 17:54:52,855 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2014-06-11 17:54:52,855 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1402488523460_0002, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@971d0d8)
2014-06-11 17:54:52,901 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2014-06-11 17:54:53,165 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /tmp/hadoop-hduser/nm-local-dir/usercache/hduser/appcache/application_1402488523460_0002
2014-06-11 17:54:53,249 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2014-06-11 17:54:53,249 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2014-06-11 17:54:53,393 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2014-06-11 17:54:53,689 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
2014-06-11 17:54:53,899 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: Paths:/user/hduser/MySort/input/data.txt:0+891082
2014-06-11 17:54:53,904 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.file is deprecated. Instead, use mapreduce.map.input.file
2014-06-11 17:54:53,904 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.start is deprecated. Instead, use mapreduce.map.input.start
2014-06-11 17:54:53,904 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.length is deprecated. Instead, use mapreduce.map.input.length
2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 
2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Exception raised during data export
2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 
2014-06-11 17:54:54,028 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Exception: 
java.util.NoSuchElementException
    at java.util.ArrayList$Itr.next(ArrayList.java:839)
    at mysort.__loadFromFields(mysort.java:198)
    at mysort.parse(mysort.java:147)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2014-06-11 17:54:54,030 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: On input: ustNU 45
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: On input file: hdfs://localhost:9000/user/hduser/MySort/input/data.txt
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: At position 0
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Currently processing split:
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Paths:/user/hduser/MySort/input/data.txt:0+891082
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: This issue might not necessarily be caused by current input
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: due to the batching nature of export.
2014-06-11 17:54:54,031 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: 
2014-06-11 17:54:54,032 INFO [Thread-12] org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
2014-06-11 17:54:54,033 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hduser (auth:SIMPLE) cause:java.io.IOException: Can't export data, please check task tracker logs
2014-06-11 17:54:54,033 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Can't export data, please check task tracker logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.util.NoSuchElementException
    at java.util.ArrayList$Itr.next(ArrayList.java:839)
    at mysort.__loadFromFields(mysort.java:198)
    at mysort.parse(mysort.java:147)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more

2014-06-11 17:54:54,037 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task

感谢您解决问题的任何帮助。

1 个答案:

答案 0 :(得分:1)

以下是Sqoop的安装和导入和导出命令的完整过程。希望完全可能对某些人有所帮助。这个由我试用和测试,实际上有效。

Download : apache.mirrors.tds.net/sqoop/1.4.4/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz

sudo mv sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz /usr/lib/sqoop

copy paste followingtwo lines in .bashrc
export SQOOP_HOME=/usr/lib/sqoop
export PATH=$PATH:$SQOOP_HOME/bin

Go to /usr/lib/sqoop/conf folder and copy sqoop-env-template.sh to new file sqoop-env.sh and modify export HADOOP_HOME ,HBASE_HOME,etc to the installation directory

Download the postgresql conector jar file from jdbc.postgresql.org/download/postgresql-9.3-1101.jdbc41.jar

create a directory manager.d in sqoop/conf/

create a file postgresql in conf/  and add the following line in it
org.postgresql.Driver=/usr/lib/sqoop/lib/postgresql-9.3-1101.jdbc41.jar

name the connector.jar file accordingly 

出口

Create a user in postgres:
createuser -P -s -e ace
Enter password for new role: ace
Enter it again: ace


CREATE DATABASE testdb OWNER ace TABLESPACE ace;

create table stud1(id int,name text);

Create a file student.txt
Add lines such as:
1,Ace
2,iloveapis

hadoop fs -put student.txt 

sqoop export --connect jdbc:postgresql://localhost:5432/testdb --username ace --password ace --table stud1 -m 1 --export-dir student.txt

check in postgres: Select * from stud1;

导入:

sqoop import --connect jdbc:postgresql://localhost:5432/testdb --username akshay --password akshay --table stud1 --m 1

hadoop fs -ls -R stud1

预期产出:

-rw-r--r--   1 hduser supergroup          0 2014-06-13 18:10 stud1/_SUCCESS
-rw-r--r--   1 hduser supergroup         21 2014-06-13 18:10 stud1/part-m-00000


hadoop fs -cat stud1/part-m-00000

预期产出:

1,Ace
2,iloveapis


hadoop fs -copyToLocal stud1/part-m-00000 $HOME/imported_data.txt