Question

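I am looking for a good approach (possibly with an example) to:

Build a simple application to drive and view the results of a Spark batch job.

So basically I am looking to:

1. Launch the Spark batch job from the driver/app.
2. Persist the results into a database at the end of the Spark batch job.
3. Notify the driver/app when the Spark job is done.
4. Have the driver then display or process the results of the Spark job.

Steps 1 and 3 are the key steps where I am looking for guidance.

I figure this must have been done many times before, and I would like to base my solution on an existing approach. Hence the question.

Looking forward to your responses.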

Answer 1

Go through the following example; it will help you build a solution:

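import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;

public class TestRun {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("TestRun");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        JavaRDD<String> lines = ctx.textFile("file:///myFile", 1);

        // foreach is an action: call() runs on the executors, once per record.
        lines.foreach(new VoidFunction<String>() {
            @Override
            public void call(String v1) throws Exception {
                // Point 2: write the logic to store the record in the database here.
            }
        });

        // This line is executed in the driver to end the job.
        ctx.stop();

        // Write your logic to process the results here.
    }
}

1. You can run this program with the spark-submit command. See "Submitting Applications" in the Spark documentation for how to submit it.
2. Write the logic that stores the results in the database inside the code block of the foreach action.
3. The logic that stops the context is executed at the end of the batch job.
4. Logic that must run when the job ends can be written after the foreach call. foreach is a blocking action, so code placed after it runs in the driver only once the job has finished.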
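
For steps 1 and 3 of the question (launching the job from a separate application and learning when it is done), one option is to start spark-submit as a child process and wait for it to exit. The following is a minimal sketch using only the JDK, not a definitive implementation. The class name SparkJobRunner and its methods are hypothetical, and it assumes that spark-submit is on the PATH and that the job runs in client deploy mode, so that the exit code reflects whether the driver succeeded. Spark also ships a SparkLauncher class in org.apache.spark.launcher that is meant for starting applications programmatically.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: launches spark-submit and reports when the job has ended.
class SparkJobRunner {

    // Builds the spark-submit command line from values supplied by the caller.
    static List<String> buildCommand(String sparkSubmit, String mainClass,
                                     String master, String appJar) {
        List<String> cmd = new ArrayList<>();
        cmd.add(sparkSubmit);
        cmd.add("--class");
        cmd.add(mainClass);
        cmd.add("--master");
        cmd.add(master);
        cmd.add(appJar);
        return cmd;
    }

    // Starts the command and blocks until the child process exits; returns its exit code.
    static int runAndWait(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO() // forward the job's stdout/stderr to this console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: SparkJobRunner <main-class> <master-url> <app-jar>");
            return;
        }
        // Assumption: "spark-submit" can be resolved through the PATH.
        int exitCode = runAndWait(buildCommand("spark-submit", args[0], args[1], args[2]));
        if (exitCode == 0) {
            // The job has ended, so the results written in step 2 can be read now.
            System.out.println("Spark job finished; results can be read from the database.");
        } else {
            System.err.println("Spark job failed with exit code " + exitCode);
        }
    }
}
```

Running it with arguments such as TestRun local[*] target/test-run.jar (a hypothetical class name and jar path) submits the job and prints a message once spark-submit has exited. Blocking on the child process is the simplest form of notification; an application that must stay responsive would run runAndWait on a background thread instead.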
