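I launched a brand-new AWS EMR Spark cluster and am using Zeppelin on it to query a MySQL database. When I try to add a MySQL interpreter in Zeppelin, the option isn't there. I searched Google for a way to make the interpreter appear, but I found no solution. How can I get a MySQL interpreter in Zeppelin so that I can query a MySQL database?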
Answer 0 (score: 4)
Spark SQL supports many features of SQL:2003 and SQL:2011 [1] [2], so you could consider doing this through Spark in Zeppelin by adding the MySQL JDBC driver as a dependency.
You should now be able to access your MySQL tables. Here is an example using the Scala API:
/* Database configuration: HOST, DATABASE, USERNAME and PASSWORD are placeholders for your own values */
val jdbcURL = s"jdbc:mysql://${HOST}/${DATABASE}"
val jdbcUsername = s"${USERNAME}"
val jdbcPassword = s"${PASSWORD}"
import java.util.Properties
val connectionProperties = new Properties()
connectionProperties.put("user", jdbcUsername)
connectionProperties.put("password", jdbcPassword)
connectionProperties.put("driver", "com.mysql.cj.jdbc.Driver")
/* Read data from MySQL: replace the "${TABLE NAME}" placeholder with the name of your table */
val desiredData = spark.read.jdbc(jdbcURL, "${TABLE NAME}", connectionProperties)
desiredData.printSchema
/* Data Manipulation */
desiredData.createOrReplaceTempView("desiredData")
val query = s"""
SELECT COUNT(*) AS `Record Number`
FROM desiredData
"""
spark.sql(query).show
val query2 = s"""
SELECT ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column1, column2) AS column3
FROM desiredData
"""
spark.sql(query2).show
.
.
.
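The example above hard-codes the connection details as placeholders. As a minimal sketch, assuming you would rather keep credentials out of the notebook, the JDBC URL and connection properties could be built from environment variables instead. The helper `mysqlSettings`, the case class `MySqlSettings`, and the `MYSQL_*` variable names are hypothetical and not part of the original answer; only the `user`, `password` and `driver` keys come from the code above.

```scala
import java.util.Properties

// Hypothetical container for the two values that spark.read.jdbc needs.
final case class MySqlSettings(url: String, properties: Properties)

// Hypothetical helper: builds the JDBC URL and connection properties from a
// map of settings (for example sys.env). The MYSQL_* names are assumptions.
def mysqlSettings(env: Map[String, String]): MySqlSettings = {
  // Fail early with a clear message if a required setting is missing.
  def required(key: String): String =
    env.getOrElse(key, throw new IllegalArgumentException(s"Missing setting: $key"))

  val host = required("MYSQL_HOST")
  val port = env.getOrElse("MYSQL_PORT", "3306") // 3306 is the default MySQL port
  val database = required("MYSQL_DATABASE")

  val props = new Properties()
  props.setProperty("user", required("MYSQL_USER"))
  props.setProperty("password", required("MYSQL_PASSWORD"))
  props.setProperty("driver", "com.mysql.cj.jdbc.Driver")

  MySqlSettings(s"jdbc:mysql://$host:$port/$database", props)
}
```

In a Zeppelin paragraph this might be used as `val settings = mysqlSettings(sys.env)` followed by `spark.read.jdbc(settings.url, "some_table", settings.properties)`, where `some_table` is again a placeholder. Taking the settings as a plain `Map` keeps the helper easy to check without a running database.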