Converting Scala code to PySpark: Stanford CoreNLP

Time: 2019-05-20 06:51:35

Tags: scala pyspark stanford-nlp ner

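Please convert the following code to PySpark. No matter what I try, I keep getting a Py4JJavaError. Alternatively, please share a link that shows how to implement Stanford CoreNLP NER in PySpark.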

import org.apache.spark.sql.functions._
import com.databricks.spark.corenlp.functions._

val input = Seq(
  (1, "<xml>Stanford University is located in California. It is a great university.</xml>")
).toDF("id", "text")

val output = input
  .select(cleanxml('text).as('doc))
  .select(explode(ssplit('doc)).as('sen))
  .select('sen, tokenize('sen).as('words), ner('sen).as('nerTags), sentiment('sen).as('sentiment))
output.show(truncate = false)

Sentiment analysis is not my priority; NER is.

0 answers:

No answers yet