I have started evaluating Microsoft.Spark with C# as the language, and wrote the following simple program:
<!-- language: lang-cs -->
    // Instantiate a session
    var spark = SparkSession
        .Builder()
        .AppName("Hello Spark!")
        .GetOrCreate();

    var df = spark.Read().Json(@"%SPARK_HOME%\examples\src\main\resources\people.json");

    // Print schema
    df.PrintSchema();

    // Apply a filter and show results
    df.Filter(df["age"] > 21).Show();
I have installed: Spark 2.4.1, Hadoop Winutils, Apache Maven, and Microsoft.Spark.Worker.
When the socket is created, I get the following error:
[Exception] [JvmBridge] No connection could be made because the target machine actively refused it 127.0.0.1:5567
at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
at System.Net.Sockets.Socket.Connect(EndPoint remoteEP)
at Microsoft.Spark.Network.DefaultSocketWrapper.Connect(IPAddress remoteaddr, Int32 port, String secret)
at Microsoft.Spark.Interop.Ipc.JvmBridge.GetConnection()
at Microsoft.Spark.Interop.Ipc.JvmBridge.CallJavaMethod(Boolean isStatic, Object classNameOrJvmObjectReference, String methodName, Object[] args)
No connection could be made because the target machine actively refused it 127.0.0.1:5567
Answer 0 (score: 0)
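I think I have found the solution. When you create an application in C#, you cannot just run the executable on its own; you need to submit it to Spark with spark-submit, which starts the JVM-side runner that the .NET process connects to. The command I used is: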
spark-submit.cmd --class org.apache.spark.deploy.DotnetRunner --master local C:\github\dotnet-spark\src\scala\microsoft-spark-2.4.x\target\microsoft-spark-2.4.x-0.2.0.jar Microsoft.Spark.CSharp.Examples.exe Sql.Basic %SPARK_HOME%\examples\src\main\resources\people.json
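
The "actively refused" error in the question is what you would expect if nothing is listening on 127.0.0.1:5567 when the executable starts. As a quick diagnostic, here is a minimal sketch that probes that port before building the `SparkSession`. `BackendProbe` and `IsListening` are hypothetical names of my own, not part of Microsoft.Spark, and the port is simply the one from the error message above.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper, not part of Microsoft.Spark.
public static class BackendProbe
{
    // Returns true if something accepts TCP connections on host:port.
    public static bool IsListening(string host, int port, int timeoutMs = 1000)
    {
        try
        {
            using (var client = new TcpClient())
            {
                // Wait returns false on timeout and throws if the connection is refused.
                return client.ConnectAsync(host, port).Wait(timeoutMs) && client.Connected;
            }
        }
        catch (AggregateException)
        {
            // Connection refused or host unreachable.
            return false;
        }
        catch (SocketException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Port taken from the error message in the question (assumption: yours may differ).
        const int port = 5567;

        if (IsListening("127.0.0.1", port))
        {
            Console.WriteLine("Something is listening on port " + port + ".");
        }
        else
        {
            Console.WriteLine("Nothing is listening on port " + port +
                              "; was the application launched through spark-submit?");
        }
    }
}
```

If the probe reports that nothing is listening, the program was most likely started directly instead of through the spark-submit command above.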