I currently save Spark RDD results into a MySQL database.
Is there a better way?
I tried the following, but compared with my current approach it is very slow:
import anorm._
import java.sql.Connection
import org.apache.spark.rdd.RDD
import scala.language.reflectiveCalls // for the structural type used by `using` below
val wordCounts: RDD[(String, Int)] = ...
// Loads the MySQL JDBC driver, then opens a connection to the given URL
def getDbConnection(dbUrl: String): Connection = {
  Class.forName("com.mysql.jdbc.Driver").newInstance()
  java.sql.DriverManager.getConnection(dbUrl)
}
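Side note: as far as I know, with a JDBC 4.0+ driver the explicit Class.forName call should not be needed, since DriverManager discovers the driver through the service loader, as long as the MySQL connector jar is on the executors' classpath. If that holds, this shorter version should behave the same:

def getDbConnection(dbUrl: String): Connection =
  java.sql.DriverManager.getConnection(dbUrl) // driver auto-registered via the service loader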
// Loan pattern: apply f to the resource, closing it even if f throws
def using[X <: { def close() }, A](resource: X)(f: X => A): A =
  try f(resource) finally resource.close()
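For what it's worth, this using helper is the generic loan pattern and works with any resource that exposes close(); for example, with a hypothetical words.txt file:

using(new java.io.BufferedReader(new java.io.FileReader("words.txt"))) { reader =>
  println(reader.readLine()) // the reader is closed automatically afterwards
}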
wordCounts.foreachPartition { iter =>
  using(getDbConnection(dbUrl)) { implicit conn =>
    iter.foreach { case (word, count) =>
      // anorm's SQL interpolator binds $word and $count as JDBC parameters
      SQL"insert into WordCount VALUES($word, $count)".executeUpdate()
    }
  }
}
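I also wondered whether batching the inserts per partition would help, instead of one round trip per row. A sketch of what I have in mind (untested; plain JDBC through the same using/getDbConnection helpers, with the WordCount table layout as above):

wordCounts.foreachPartition { iter =>
  using(getDbConnection(dbUrl)) { conn =>
    conn.setAutoCommit(false) // one commit per partition instead of per row
    val stmt = conn.prepareStatement("insert into WordCount values (?, ?)")
    try {
      iter.foreach { case (word, count) =>
        stmt.setString(1, word)
        stmt.setInt(2, count)
        stmt.addBatch() // accumulate rows client-side
        // for very large partitions it would make sense to executeBatch()
        // every few thousand rows instead of holding everything in memory
      }
      stmt.executeBatch() // send the accumulated rows in one go
      conn.commit()
    } finally {
      stmt.close()
    }
  }
}

I have also read that adding rewriteBatchedStatements=true to the MySQL JDBC URL lets the driver rewrite such batches into multi-row inserts, but I have not verified that here.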