Error executing a Spark example using the Microsoft .NET library

Time: 2019-05-13 09:01:01

Tags: apache-spark

I have started evaluating Microsoft.Spark with C# as the language and wrote the following simple program:

<!-- language: lang-cs -->

    // Instantiate a session
    var spark = SparkSession
        .Builder()
        .AppName("Hello Spark!")
        .GetOrCreate();

    var df = spark.Read().Json(@"%SPARK_HOME%\examples\src\main\resources\people.json");

    // Print schema
    df.PrintSchema();

    // Apply a filter and show results
    df.Filter(df["age"] > 21).Show();

I have installed: Spark 2.4.1, Hadoop winutils, Apache Maven, and Microsoft.Spark.Worker.

I get the following error when the socket is created:

    [Exception] [JvmBridge] No connection could be made because the target machine actively refused it 127.0.0.1:5567
       at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
       at System.Net.Sockets.Socket.Connect(EndPoint remoteEP)
       at Microsoft.Spark.Network.DefaultSocketWrapper.Connect(IPAddress remoteaddr, Int32 port, String secret)
       at Microsoft.Spark.Interop.Ipc.JvmBridge.GetConnection()
       at Microsoft.Spark.Interop.Ipc.JvmBridge.CallJavaMethod(Boolean isStatic, Object classNameOrJvmObjectReference, String methodName, Object[] args)

No connection could be made because the target machine actively refused it 127.0.0.1:5567

1 answer:

Answer 0: (score: 0)

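I think I have found the solution: when you create an application in C#, you need to submit it to Spark with the following spark-submit command rather than running the executable directly: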

    spark-submit.cmd --class org.apache.spark.deploy.DotnetRunner --master local C:\github\dotnet-spark\src\scala\microsoft-spark-2.4.x\target\microsoft-spark-2.4.x-0.2.0.jar Microsoft.Spark.CSharp.Examples.exe Sql.Basic %SPARK_HOME%\examples\src\main\resources\people.json
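
The "actively refused" message for 127.0.0.1:5567 suggests that nothing was listening on that port when the program was started directly, presumably because the JVM-side process that spark-submit launches through `DotnetRunner` was not running. As a rough way to check this before starting the application, the sketch below simply tries to open a TCP connection to that address. `BackendProbe` and `IsListening` are hypothetical names made up for this illustration and are not part of Microsoft.Spark; the port 5567 is taken from the error message above and may differ on your setup.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper for illustration only; not part of Microsoft.Spark.
public static class BackendProbe
{
    // Returns true if some process accepts TCP connections on host:port.
    public static bool IsListening(string host, int port)
    {
        try
        {
            using (var client = new TcpClient())
            {
                // Throws SocketException if the connection is refused.
                client.Connect(host, port);
                return true;
            }
        }
        catch (SocketException)
        {
            return false;
        }
    }

    public static void Main(string[] args)
    {
        // Assumption: 5567 is the port shown in the error message;
        // pass a different port as the first argument if needed.
        int port = args.Length > 0 ? int.Parse(args[0]) : 5567;

        Console.WriteLine(IsListening("127.0.0.1", port)
            ? $"Something is listening on 127.0.0.1:{port}."
            : $"Nothing is listening on 127.0.0.1:{port}; submit the app with spark-submit instead of running it directly.");
    }
}
```

If the probe reports that nothing is listening, running the program through the spark-submit command above is the fix this answer proposes.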