PySpark - Failed to find data source: com.databricks.spark.xml

Asked: 2018-06-22 13:40:09

Tags: apache-spark pyspark

I am working with Spark 2.3.0 on CentOS 7. I have linked PySpark with Jupyter, but I get an error when trying to read an XML file:

df = pyspark.SQLContext(sc).read.format('com.databricks.spark.xml').options(rowTag='books').load('xm.xml')

Py4JJavaError         Traceback (most recent call last)
<ipython-input-13-c0ea09e4b676> in <module>()
----> 1 df = pyspark.SQLContext(sc).read.format('com.databricks.spark.xml').options(rowTag='books').load('xm.xml')
/usr/lib/spark/python/pyspark/sql/readwriter.pyc in load(self, path, format, schema, **options)
    164         self.options(**options)
    165         if isinstance(path, basestring):
--> 166             return self._df(self._jreader.load(path))
    167         elif path is not None:
    168             if type(path) != list:

  ...



  Py4JJavaError: An error occurred while calling o61.load.
: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark.apache.org/third-party-projects.html
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:635)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:190)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)
 Caused by: java.lang.ClassNotFoundException:

How can I add the com.databricks package so that this works in my setup?
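For context: spark-xml is an external package, so the ClassNotFoundException usually means it was never put on Spark's classpath. A common way to add it is via `--packages`; the sketch below assumes a Scala 2.11 build of Spark 2.3.0, and the spark-xml version shown is illustrative, not prescriptive.

```shell
# Option 1: launch PySpark with the package on the classpath
# (Maven coordinates: groupId:artifactId_scalaVersion:version)
pyspark --packages com.databricks:spark-xml_2.11:0.4.1

# Option 2: for a Jupyter kernel that creates the SparkContext itself,
# export the submit args BEFORE starting the notebook server, so the
# package is resolved when the context starts
export PYSPARK_SUBMIT_ARGS='--packages com.databricks:spark-xml_2.11:0.4.1 pyspark-shell'
```

Equivalently, `spark.jars.packages` can be set in spark-defaults.conf with the same coordinates; either way the dependency must be in place before the SparkContext is created, not after.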

0 Answers:

There are no answers yet.