I am trying to import a table from an Azure SQL database into Hive using the Azure PowerShell cmdlets below.
$hadoopUserName = "<hadoopUserName>"
$hadoopUserPassword = "<hadoopUserPassword>"
$hadoopUserPW = ConvertTo-SecureString -String $hadoopUserPassword -AsPlainText -Force
$clusterCreds = New-Object System.Management.Automation.PSCredential($hadoopUserName,$hadoopUserPW)
$clusterName = "<myClusterName>"
$resourceGroupName = "<Resource_Group>"
$storageAccountName = "<Storage_Account_Name>"
$containerName = "<Storage_Container>"
$storageAccountKey = Get-AzureRmStorageAccountKey -ResourceGroupName $resourceGroupName -Name $storageAccountName | %{ $_.Key1 }
$sqlDatabaseServerName = "<AzureSQLServer>"
$sqlDatabaseLogin = "<SQLDatabase_Login>"
$sqlDatabasePassword = "<SQLDatabase_Password>"
$sqlDatabaseName = "<SQLDatabase>"
$connectionString = "jdbc:sqlserver://$sqlDatabaseServerName.database.windows.net;database=$sqlDatabaseName"
$sqoopDef = New-AzureRmHDInsightSqoopJobDefinition `
-Command "import --connect $connectionString --username $sqlDatabaseLogin --password $sqlDatabasePassword --table ContactInfo --where City='London' --hive-import --hive-table HiveDB.ContactInfo -m 1 --append"
$sqoopJob = Start-AzureRmHDInsightJob `
-ClusterName $clusterName `
-HttpCredential $clusterCreds `
-JobDefinition $sqoopDef #-Debug -Verbose
Wait-AzureRmHDInsightJob `
-ResourceGroupName $resourceGroupName `
-ClusterName $clusterName `
-HttpCredential $clusterCreds `
-JobId $sqoopJob.JobId
Write-Host "Standard Error" -BackgroundColor Green
Get-AzureRmHDInsightJobOutput `
-ResourceGroupName $resourceGroupName `
-ClusterName $clusterName `
-DefaultStorageAccountName $storageAccountName `
-DefaultStorageAccountKey $storageAccountKey `
-DefaultContainer $containerName `
-HttpCredential $clusterCreds `
-JobId $sqoopJob.JobId `
-DisplayOutputType StandardError
Write-Host "Standard Output" -BackgroundColor Green
Get-AzureRmHDInsightJobOutput `
-ResourceGroupName $resourceGroupName `
-ClusterName $clusterName `
-DefaultStorageAccountName $storageAccountName `
-DefaultStorageAccountKey $storageAccountKey `
-DefaultContainer $containerName `
-HttpCredential $clusterCreds `
-JobId $sqoopJob.JobId `
-DisplayOutputType StandardOutput
The Sqoop import fails with the following error:
15/12/22 17:48:01 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.microsoft.sqlserver.jdbc.SQLServerDriver
java.lang.RuntimeException: Could not load db driver class: com.microsoft.sqlserver.jdbc.SQLServerDriver
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:848)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:736)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:759)
at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:269)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:240)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1773)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1578)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:601)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Following the Azure documentation (https://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-sqoop-mac-linux/), I ran the command below to create a link from the Sqoop lib directory to the SQL Server JDBC driver. The error still persists.
sudo ln /usr/share/java/sqljdbc_4.1/enu/sqljdbc41.jar /usr/hdp/current/sqoop-client/lib/sqljdbc41.jar
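To confirm that the link actually exists, a minimal sketch like the one below could be run in the same SSH session. The helper name `has_sqljdbc_jar` is made up for illustration, and the directory is the Sqoop lib path from the `ln` command above. It only inspects the node the session is connected to.

```shell
# Hypothetical helper: succeeds if a SQL Server JDBC jar (sqljdbc*.jar)
# is present in the directory passed as the first argument.
has_sqljdbc_jar() {
    dir="$1"
    # ls exits non-zero when the pattern matches no file
    ls "$dir"/sqljdbc*.jar >/dev/null 2>&1
}

# Sqoop lib directory taken from the ln command above
SQOOP_LIB="/usr/hdp/current/sqoop-client/lib"

if has_sqljdbc_jar "$SQOOP_LIB"; then
    echo "driver jar found in $SQOOP_LIB"
else
    echo "driver jar NOT found in $SQOOP_LIB"
fi
```

The same function can be pointed at any other directory to see whether a copy of the driver is present there.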
When I run the Sqoop import from an SSH session, everything works fine.
sqoop import --connect 'jdbc:sqlserver://MyServer.database.windows.net:1433;database=Mydatabase' --username <Database_Login> --password <Database_Password> --table 'ContactInfo' --where "City='London'" --hive-import --hive-table HiveDB.ContactInfo -m 1 --append
Can anyone help me solve this problem?