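The code below produces this error:
ERROR: An unexpected error occurred while tokenizing input. The following traceback may be corrupted or invalid. The error message is: ('EOF in multi-line string', (1, 23))
This is running on a Spark cluster in Azure Databricks. It is odd, because the same code works fine on my Linux Spark cluster.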
import sys
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
spark.conf.set("dfs.adls.oauth2.access.token.provider.type", "ClientCredential")
spark.conf.set("dfs.adls.oauth2.client.id", "dc4f6b60-f13f-4af2-936d-04e16cd12642")
spark.conf.set("dfs.adls.oauth2.credential", "JTzYVPJnwwaHd5axSBFIf0LDgr5ed9E5CfEsCWvgj7I=")
spark.conf.set("dfs.adls.oauth2.refresh.url", "https://login.microsoftonline.com/1eea4c39-2312-4936-80b7-99b419d587e5/oauth2/token")
df1 = spark.read.json("adl://carlslake.azuredatalakestore.net/jfolder2/Data_Country.json")
df2 = spark.read.json("adl://carlslake.azuredatalakestore.net/jfolder2/Data_Customer.json")
df3 = spark.read.json("adl://carlslake.azuredatalakestore.net/jfolder2/Data_Sales.json")
df4 = spark.read.json("adl://carlslake.azuredatalakestore.net/jfolder2/Data_SalesDetails.json")
df5 = spark.read.json("adl://carlslake.azuredatalakestore.net/jfolder2/Data_Stock.json")
df1.createOrReplaceTempView('Data_Country')
df2.createOrReplaceTempView('Data_Customer')
df3.createOrReplaceTempView('Data_Sales')
df4.createOrReplaceTempView('Data_SalesDetails')
df5.createOrReplaceTempView('Data_Stock')
#This is taken from 'By MySelf Adding .. in dbforge this shows how to do joins
example1 = spark.sql("""SELECT
CF.CountryName AS CountrySold
,COUNT(CF.CountryName) AS soldincountry
,MAX(CB.SalesDetailsID) AS CarsSold
FROM Data_Stock CS
INNER JOIN Data_SalesDetails CB
ON CS.StockCode = CB.StockID
INNER JOIN Data_Sales CD
ON CB.SalesID = CD.SalesID
INNER JOIN Data_Customer CG
ON CD.CustomerID = CG.CustomerID
INNER JOIN Data_Country CF
ON CG.Country = CF.CountryISO2
GROUP BY CF.CountryName""")
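
As a sanity check on the triple-quoted string itself, here is a minimal sketch that runs without Spark. It assumes the "EOF in multi-line string" message is what a tokenizer reports when it is handed only the first line of the statement (for example while a notebook formats a traceback), rather than a sign that the string is actually unterminated. The shortened query and the helper name `parses_cleanly` are made up for illustration and are not part of my job.

```python
import ast

# Shortened stand-in for the real statement: same shape, a triple-quoted
# SQL string that opens on the first line and closes on the last one.
snippet = '''example1 = spark.sql("""SELECT
CF.CountryName AS CountrySold
FROM Data_Country CF
GROUP BY CF.CountryName""")
'''


def parses_cleanly(source: str) -> bool:
    """Return True if `source` is syntactically valid Python (nothing is executed)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# The whole statement parses: the multi-line string is properly closed.
full_ok = parses_cleanly(snippet)

# The first line on its own does not parse: the string is still open there.
first_line_ok = parses_cleanly(snippet.splitlines()[0])

print("whole statement parses:", full_ok)
print("first line alone parses:", first_line_ok)
```

If the whole statement parses while the first line alone does not, the tokenizer message is probably secondary, and the real exception would be whatever appears further down in the cluster output.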
Any ideas where I might be going wrong?