Reading data from MongoDB into Spark

Time: 2017-04-17 09:11:45

Tags: mongodb apache-spark apache-spark-sql

I am trying to read a collection from MongoDB as a Spark DataFrame. I am using the Eclipse Scala IDE, and this is what I did:

package TestMongoDB

import org.apache.spark.sql.SparkSession
import com.mongodb.spark.sql._
import com.mongodb.spark._
import org.bson.Document
import com.mongodb.spark.config._

object MongoDB extends App {

  try {
    val sparkSession = SparkSession.builder().master("local").getOrCreate()

    // Build a connection string of the form mongodb://host:port/database.collection
    def makeMongoURI(uri: String, database: String, collection: String) = s"${uri}/${database}.${collection}"

    val mongoURI = "mongodb://127.0.0.1:27017"
    val Conf = makeMongoURI(mongoURI, "io", "thing")
    val readConfigintegra: ReadConfig = ReadConfig(Map("uri" -> Conf))

    // Uses the ReadConfig built above (same URI as mongodb://127.0.0.1:27017/io.thing)
    val df3 = sparkSession.sqlContext.loadFromMongoDB(readConfigintegra)
    df3.printSchema()
  } catch {
    case t: Throwable =>
      t.printStackTrace() // TODO: handle error
      println(t.getMessage)
  }

}

I got the following error:

java.lang.IncompatibleClassChangeError: Implementing class
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(Unknown Source)
    at java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.access$100(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(Unknown Source)
    at java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.access$100(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at com.mongodb.MongoClientOptions$Builder.<init>(MongoClientOptions.java:758)
    at com.stratio.datasource.mongodb.config.MongodbConfig$.<init>(MongodbConfig.scala:72)
    at com.stratio.datasource.mongodb.config.MongodbConfig$.<clinit>(MongodbConfig.scala)
    at TestMongoDB.MongoDB$.delayedEndpoint$TestMongoDB$MongoDB$1(MongoDB.scala:13)

Any help would be appreciated, thanks.
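
A note on the trace above: `java.lang.IncompatibleClassChangeError` is typically raised when a class was compiled against one version of a library and is loaded next to an incompatible version at run time. The trace passes through `com.stratio.datasource.mongodb.config.MongodbConfig`, which the code shown never imports, so a second MongoDB data source (and possibly a different MongoDB Java driver) may be on the classpath alongside the connector. The following is only a minimal `build.sbt` sketch of a setup that keeps a single connector; the project name and every version number are assumptions for illustration and would need to match the actual Spark and Scala installation.

```scala
// build.sbt (sketch; name and versions are assumptions, adjust to your environment)
name := "test-mongodb"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  // Spark SQL provides SparkSession and DataFrame
  "org.apache.spark" %% "spark-sql" % "2.1.0",
  // MongoDB Spark Connector; it brings in its own MongoDB Java driver transitively
  "org.mongodb.spark" %% "mongo-spark-connector" % "2.0.0"
  // Intentionally no com.stratio.datasource entry and no extra mongo-java-driver entry
)
```

If the jars were added by hand to the Eclipse build path instead, the same idea would apply: check whether a Stratio jar or a second MongoDB Java driver jar is present and, if so, try removing it.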

1 Answer:

Answer 0 (score: 0)

You can also do it like this:

import com.mongodb.spark.sql._
import com.mongodb.spark._
import org.bson.Document
import com.mongodb.spark.config._

// Build a connection string of the form mongodb://host:port/database.collection
def makeMongoURI(uri: String, database: String, collection: String) = s"${uri}/${database}.${collection}"

val mongoURI = "mongodb://000.000.000.000:27017"
val Conf = makeMongoURI(mongoURI, "DBname", "collectionname")

val readConfigintegra: ReadConfig = ReadConfig(Map("uri" -> Conf))

// Uses the ReadConfig; sqlContext is assumed to be in scope already (for example in spark-shell)
val df3 = sqlContext.loadFromMongoDB(readConfigintegra)

With the code above, you can read a MongoDB collection and store it as a DataFrame.
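
The URI-building step in the code above can be checked on its own, without Spark or a running MongoDB. Below is a minimal sketch in plain Scala; `MongoUri` and `make` are hypothetical names introduced here for illustration (they are not part of the MongoDB Spark Connector), and the trailing-slash handling and the empty-name checks are additions that the original `makeMongoURI` does not have.

```scala
// Hypothetical helper, not part of the MongoDB Spark Connector.
// Builds a connection string of the form mongodb://host:port/database.collection.
object MongoUri {
  def make(base: String, database: String, collection: String): String = {
    // Fail early on empty names instead of producing a malformed URI
    require(database.nonEmpty, "database must not be empty")
    require(collection.nonEmpty, "collection must not be empty")
    // stripSuffix avoids a double slash when the base already ends with "/"
    s"${base.stripSuffix("/")}/${database}.${collection}"
  }
}
```

For example, `MongoUri.make("mongodb://127.0.0.1:27017/", "io", "thing")` yields `mongodb://127.0.0.1:27017/io.thing`, which could then be passed as `ReadConfig(Map("uri" -> ...))` exactly as in the answer.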