How to implement the TensorFrames Spark package on Data Science Experience?

Time: 2017-05-22 19:46:41

Tags: python maven apache-spark pyspark data-science-experience

I have been able to import the package:

import pixiedust
pixiedust.installPackage("databricks:tensorframes:0")

But when I try a simple example:

import tensorflow as tf
import tensorframes as tfs
from pyspark.sql import Row
data = [Row(x=[float(x), float(2 * x)],
            key=str(x % 2),
            z = float(x+1)) for x in range(1, 6)]
df = spark.createDataFrame(data)
tfs.print_schema(df)

I get the following error:

...

Py4JJavaError: An error occurred while calling o97.loadClass.
: java.lang.NoClassDefFoundError:com.typesafe.scalalogging.slf4j.LazyLogging

...

I've looked up the issue, and it seems there is an older scala-logging-slf4j artifact in the dependency tree. How can I remove this artifact? Once it is removed, I think I can add the newer version with PixieDust:

pixiedust.installPackage("https://mvnrepository.com/artifact/com.typesafe.scala-logging/scala-logging-slf4j_2.10/2.1.2")
pixiedust.installPackage("https://mvnrepository.com/artifact/com.typesafe.scala-logging/scala-logging-api_2.10/2.1.2")

1 answer:

Answer 0 (score: 1)

Charles from IBM Support helped me find the jars to include:

pixiedust.installPackage("http://central.maven.org/maven2/com/typesafe/scala-logging/scala-logging-slf4j_2.10/2.1.2/scala-logging-slf4j_2.10-2.1.2.jar")
pixiedust.installPackage("http://central.maven.org/maven2/com/typesafe/scala-logging/scala-logging-api_2.10/2.1.2/scala-logging-api_2.10-2.1.2.jar")

This technically resolves the first error, but TensorFrames still does not work. I will post another, more specific question.
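For reference, the two jar URLs in this answer follow the standard Maven repository layout: the dots in the group ID become slashes, followed by the artifact ID, the version, and a file named artifact-version.jar. A minimal sketch of that mapping is below; the helper name maven_jar_url and the list SCALA_LOGGING_JARS are hypothetical names used only for illustration, and the repository base is simply the one that appears in the URLs above.

```python
# Hypothetical helper (not part of PixieDust): builds a direct jar URL
# from a Maven coordinate using the standard repository layout.
def maven_jar_url(group_id, artifact_id, version,
                  repo_base="http://central.maven.org/maven2"):
    # Dots in the group ID map to directory separators.
    group_path = group_id.replace(".", "/")
    # The jar file is named <artifactId>-<version>.jar.
    jar_name = "{0}-{1}.jar".format(artifact_id, version)
    return "/".join([repo_base, group_path, artifact_id, version, jar_name])


# The two scala-logging artifacts named in this answer.
SCALA_LOGGING_JARS = [
    maven_jar_url("com.typesafe.scala-logging", "scala-logging-slf4j_2.10", "2.1.2"),
    maven_jar_url("com.typesafe.scala-logging", "scala-logging-api_2.10", "2.1.2"),
]

for url in SCALA_LOGGING_JARS:
    print(url)
```

Each printed URL could then be passed to pixiedust.installPackage, as in the calls above. This only reproduces the URLs; it does not change which jars end up on the classpath.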