My use case is as follows: I need to be able to call Java methods from Python code. From PySpark this seems easy. I start pyspark like this:

./pyspark --driver-class-path /path/to/app.jar

and run the following from the pyspark shell:
from sparkjobserver.api import SparkJob, build_problems
from py4j.java_gateway import JavaGateway, java_import

class WordCountSparkJob(SparkJob):
    def validate(self, context, runtime, config):
        if config.get('input.strings', None):
            return config.get('input.strings')
        else:
            return build_problems(['config input.strings not found'])

    def run_job(self, context, runtime, data):
        x = context._jvm.com.abc.def.App
        return x.getMessage()
This works fine.
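For reference, the validate/run_job flow above can be sketched in plain Python. StubApp and StubContext below are hypothetical stand-ins (not the real sparkjobserver or py4j API) so the flow can run without Spark installed:

```python
# A minimal, self-contained sketch of the validate -> run_job flow used
# above. StubApp and StubContext are hypothetical stand-ins so the flow
# runs without Spark, py4j, or the jobserver.

class StubApp:
    # Stands in for the Java class reached through context._jvm.
    @staticmethod
    def getMessage():
        return "hello from java"

class StubContext:
    # Mimics the attribute chain context._jvm.<package>.<Class>.
    class _jvm:
        class com:
            class abc:
                App = StubApp

class WordCountJobSketch:
    def validate(self, context, runtime, config):
        if config.get('input.strings', None):
            return config.get('input.strings')
        return ['config input.strings not found']  # build_problems stand-in

    def run_job(self, context, runtime, data):
        return context._jvm.com.abc.App.getMessage()

job = WordCountJobSketch()
ctx = StubContext()
data = job.validate(ctx, None, {'input.strings': ['a', 'b']})
print(job.run_job(ctx, None, data))  # -> hello from java
```

The jobserver calls validate first and only passes its result to run_job, which is why the stub mirrors that two-step contract.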
When using Spark Jobserver, I followed the WordCountSparkJob.py example.
spark {
jobserver {
jobdao = spark.jobserver.io.JobSqlDAO
}
context-settings {
python {
paths = [
"/home/xxx/SPARK/spark-1.6.0-bin-hadoop2.6/python",
"/home/xxx/.local/lib/python2.7/site-packages/pyhocon",
"/home/xxx/SPARK/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip",
"/home/xxx/SPARK/spark-1.6.0-bin-hadoop2.6/python/lib/py4j-0.9-src.zip",
    "/home/xxx/gitrepos/spark-jobserver/job-server-python/src/python/dist/spark_jobserver_python-NO_ENV-py2.7.egg"
]
}
dependent-jar-uris = ["file:///path/to/app.jar"]
}
home = /home/path/to/spark
}
My python.conf looks like the above. When I submit the job, I get the following error:
[2016-10-08 23:03:46,214] ERROR jobserver.python.PythonJob [] [akka://JobServer/user/context-supervisor/py-context] - From Python: Error while calling 'run_job'TypeError("'JavaPackage' object is not callable",)
[2016-10-08 23:03:46,226] ERROR jobserver.python.PythonJob [] [akka://JobServer/user/context-supervisor/py-context] - Python job failed with error code 4
[2016-10-08 23:03:46,228] ERROR .jobserver.JobManagerActor [] [akka://JobServer/user/context-supervisor/py-context] - Got Throwable
java.lang.Exception: Python job failed with error code 4
at spark.jobserver.python.PythonJob$$anonfun$1.apply(PythonJob.scala:85)
at scala.util.Try$.apply(Try.scala:161)
at spark.jobserver.python.PythonJob.runJob(PythonJob.scala:62)
at spark.jobserver.python.PythonJob.runJob(PythonJob.scala:13)
at spark.jobserver.JobManagerActor$$anonfun$getJobFuture$4.apply(JobManagerActor.scala:288)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-10-08 23:03:46,232] ERROR .jobserver.JobManagerActor [] [akka://JobServer/user/context-supervisor/py-context] - Exception from job 942727f0-dd81-445d-bc64-bd18880eb4bc:
java.lang.Exception: Python job failed with error code 4
at spark.jobserver.python.PythonJob$$anonfun$1.apply(PythonJob.scala:85)
at scala.util.Try$.apply(Try.scala:161)
at spark.jobserver.python.PythonJob.runJob(PythonJob.scala:62)
at spark.jobserver.python.PythonJob.runJob(PythonJob.scala:13)
at spark.jobserver.JobManagerActor$$anonfun$getJobFuture$4.apply(JobManagerActor.scala:288)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-10-08 23:03:46,232] INFO k.jobserver.JobStatusActor [] [akka://JobServer/user/context-supervisor/py-context/$a] - Job 942727f0-dd81-445d-bc64-bd18880eb4bc finished with an error
[2016-10-08 23:03:46,233] INFO r$RemoteDeadLetterActorRef [] [akka://JobServer/deadLetters] - Message [spark.jobserver.CommonMessages$JobErroredOut] from Actor[akka://JobServer/user/context-supervisor/py-context/$a#1919442151] to Actor[akka://JobServer/deadLetters] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
In the python.conf file I have app.jar as an entry in dependent-jar-uris. What am I missing here?
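To rule out simple path problems before submitting, a small pre-flight check can be run locally. This helper is purely hypothetical (not part of Spark Jobserver); it only flags paths entries that do not exist and dependent-jar-uris entries that are not file:// URIs:

```python
# Hypothetical pre-flight check for a python.conf like the one above.
# Not part of Spark Jobserver; it just flags paths that do not exist
# and dependent-jar-uris entries that are not file:// URIs.
import os
from urllib.parse import urlparse

def check_config(paths, jar_uris):
    problems = []
    for p in paths:
        if not os.path.exists(p):
            problems.append("missing path: " + p)
    for uri in jar_uris:
        parsed = urlparse(uri)
        if parsed.scheme != "file":
            problems.append("not a file:// URI: " + uri)
        elif not os.path.exists(parsed.path):
            problems.append("missing jar: " + parsed.path)
    return problems

# Example with entries that do not exist on this machine:
print(check_config(["/no/such/dir"], ["file:///no/such/app.jar"]))
# -> ['missing path: /no/such/dir', 'missing jar: /no/such/app.jar']
```

A check like this catches typos in the long egg and zip paths, but it cannot tell whether the jar actually reaches the JVM classpath of the context.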
Answer (score: 1):
The error "'JavaPackage' object is not callable"
most likely means that PySpark cannot see your jar, or the class inside it.
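To see why the failure only appears at call time, here is a small stand-in (not py4j itself) that mimics how py4j resolves class names lazily when a jar is missing from the classpath:

```python
# Mimics py4j's lazy package resolution: attribute access on the JVM
# view always succeeds and returns another package object, so a missing
# jar or a typo in the class name surfaces only when the object is
# finally called, as "'...' object is not callable".

class FakeJavaPackage:
    def __init__(self, name):
        self.name = name

    def __getattr__(self, attr):
        # Any unknown attribute is treated as a deeper package.
        return FakeJavaPackage(self.name + "." + attr)

jvm = FakeJavaPackage("jvm")
cls = jvm.com.abc.App    # no error yet, even though nothing exists
try:
    cls.getMessage()     # fails here, just like in the jobserver log
except TypeError as e:
    print(e)             # -> 'FakeJavaPackage' object is not callable
```

In real py4j the object would be a JavaPackage rather than a JavaClass, which is why the fix is to make the jar visible to the context's JVM rather than to change the Python code.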