How to import JSON objects from HDFS (Hive) into MySQL

Date: 2014-02-12 18:03:13

Tags: hive hdfs sqoop

I tried Sqoop, but it fails with a TextExportMapper error.

The JSON objects contain nested fields. How can I export this data from Hive (HDFS) to MySQL?

{"Numberoffollowers":"77","Description":"A.C. Telezone - Tata Photon Plus in Delhi & NCR\n\n\nA.C. Telezone believe in providing the best suited plan for Tata Photon Plus in Delhi & NCR, after understanding the clients requirement &  getting the best time to time offers to our clients.\n\nTata Photon Plus in Delhi & NCR is a High Speed Internet Access Service in the form of a USB Modem offered by Tata Teleservices Ltd.\n\nTata Photon+ in Delhi/NCR the next generation technology offers a great mobile internet connectivity solution. Tata Photon Plus now gives access to the internet at never before speeds.\n\nFree Home Delivery -  Tata  Photon Plus in Delhi & Ncr\n\nA C Telezone is a wholesale dealer offering Tata Photon Plus in Delhi & NCR.\n\nTata Photon Plus in Delhi & NCR for Desktop and Laptop.\nStay connected wherever you go with Tata Photon Plus in Delhi & NCR.\n\nAdvantage of using the Tata Photon Plus in Delhi & NCR:\n\n\u2022Enjoy Superior Indoor Connectivity\n\u2022Free Roaming anywhere in India\n\u2022Dedicated Data Carrier\n\u2022Enhanced Signal Reception\n\u2022Just 3 Mouse Clicks Simple Activation process\n\u2022Photon Plus Coverage now in Delhi & NCR\n\u2022Photon Care \u2013 Dedicated Photon Call Centre\n\u2022100% Cash Back Offer\n\nFree Delivery on call, Tata Photon Plus in Delhi & NCR\n\nTata Photon Plus Offers Affordable plans to suit your needs for everyone in Delhi & NCR.\n\nFor Special Offers and Free Home Delivery of Tata Photon Plus in Delhi & NCR\n\nContact:\nShakti Kalra\n9210450000\nA.C.Telezone\nCSP/Franchisee- TTSL\nNew Delhi.\nhttp://www.actelezone.com/","EmployeeCountRange ":"C,11-50","Locations":[{"street2":"","regionCode":"7151","street1":"New Delhi","postalCode":"110001","state":"Delhi","countryCode":"in","city":"New Delhi"}],"WebsiteUrl":"http://www.actelezone.com/","Name":"A.C. Telezone - Tata Photon Plus in Delhi & NCR","Status":"Operating","TwitterId":"","searchName":"A & C Wholesale","Foundedyear":"2009","ContactInfo":[{"fax":"","phone2":"","phone1":"+919210450000"}],"Blog":"","EmailDomains":"[actelezone.com]","Specialities":"[Tata Photon Plus in Delhi & NCR, Tata Photon Plus in Delhi, Tata Photon Plus in Delhi & NCR - Internet Services]","Industry":"Internet"},

sqoop-export --connect jdbc:mysql://****/HIVE_DATA --username user --password user --table linkedin_source --export-dir /user/hive/warehouse/linkedin_source/

The error is:

Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: $HADOOP_HOME is deprecated.

14/02/13 12:48:10 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/02/13 12:48:11 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/02/13 12:48:11 INFO tool.CodeGenTool: Beginning code generation
14/02/13 12:48:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `linkedin_source` AS t LIMIT 1
14/02/13 12:48:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `linkedin_source` AS t LIMIT 1
14/02/13 12:48:18 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /hive/Hadoop/hadoop-1.2.1
Note: /tmp/sqoop-root/compile/56ea92df7c4494a3bad2e614859f19e1/linkedin_source.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/02/13 12:48:32 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/56ea92df7c4494a3bad2e614859f19e1/linkedin_source.jar
14/02/13 12:48:33 INFO mapreduce.ExportJobBase: Beginning export of linkedin_source
14/02/13 12:48:48 INFO input.FileInputFormat: Total input paths to process : 6
14/02/13 12:48:48 INFO input.FileInputFormat: Total input paths to process : 6
14/02/13 12:48:49 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/02/13 12:48:49 WARN snappy.LoadSnappy: Snappy native library not loaded
14/02/13 12:48:51 INFO mapred.JobClient: Running job: job_201402121526_0075
14/02/13 12:48:52 INFO mapred.JobClient:  map 0% reduce 0%
14/02/13 12:59:46 INFO mapred.JobClient: Task Id : attempt_201402121526_0075_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.util.NoSuchElementException
        at java.util.ArrayList$Itr.next(ArrayList.java:834)
        at linkedin_source.__loadFromFields(linkedin_source.java:909)
        at linkedin_source.parse(linkedin_source.java:768)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
        ... 10 more

This is the error I get. How can I resolve it? Is it caused by the nested JSON objects in HDFS?

1 Answer:

Answer 0 (score: 0):

Use a JsonSerDe, a read/write SerDe for JSON data. It lets Hive read and write data in JSON format.
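The Caused by: java.util.NoSuchElementException in your stack trace comes from Sqoop's generated __loadFromFields method: Sqoop's text export splits each input line on a field delimiter, and a raw JSON line does not yield the column count the linkedin_source MySQL table expects. Sqoop cannot parse JSON itself, so the usual approach is to let Hive parse the JSON with a JSON SerDe, flatten the nested fields into a plain delimited table, and export that table instead.

Below is a minimal sketch, assuming the open-source JSON SerDe (org.openx.data.jsonserde.JsonSerDe) jar is available; the jar path, table names, and the column subset are illustrative, not taken from the original post:

ADD JAR /path/to/json-serde.jar;  -- hypothetical jar location

-- External table over the raw JSON lines; the SerDe maps top-level
-- keys to columns and nested objects/arrays to structs and arrays.
CREATE EXTERNAL TABLE linkedin_json (
  numberoffollowers STRING,
  name              STRING,
  industry          STRING,
  websiteurl        STRING,
  locations         ARRAY<STRUCT<street1:STRING, city:STRING, state:STRING, countrycode:STRING>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/user/hive/warehouse/linkedin_source/';

-- Flatten the nested fields into a plain delimited table that Sqoop
-- can export; Ctrl-A (\001) is Hive's default field delimiter.
CREATE TABLE linkedin_flat
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
STORED AS TEXTFILE
AS SELECT
  numberoffollowers,
  name,
  industry,
  websiteurl,
  locations[0].city  AS city,
  locations[0].state AS state
FROM linkedin_json;

Then export the flattened table, telling Sqoop which delimiter the files use (the MySQL target table would need matching columns):

sqoop-export --connect jdbc:mysql://****/HIVE_DATA \
  --username user -P \
  --table linkedin_flat \
  --export-dir /user/hive/warehouse/linkedin_flat/ \
  --input-fields-terminated-by '\001'

If a record can have several entries in locations, a LATERAL VIEW with explode(locations) would produce one output row per location instead of only the first element. Alternatively, for just a handful of fields, Hive's built-in get_json_object() can extract values from a single JSON string column without any extra jar.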