How to pass data between Bolts and JavaBolts when developing an Apache Storm topology in C#

Asked: 2015-05-08 17:18:27

Tags: c# azure apache-storm hdinsight

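We are working on a project in which we use Storm on HDInsight to analyze real-time data. We use Event Hubs for both input and output, and we are having trouble passing data through the topology. We currently have a JavaSpout as the input handler, a custom bolt (Bolt1) that is supposed to perform some analysis on the data, and a JavaBolt that is supposed to take the analyzed data and send it to the output Event Hub. Passing data through the JavaSpout and JavaBolts works like a charm, but once our custom bolt is involved, the data gets wrapped or otherwise altered and no longer shows up as it should. The output should be a JSON string, but instead it shows something seemingly random, such as: [B@6d645e45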
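For context on the `[B@6d645e45` output: on the JVM this is what the default `Object.toString()` returns for a `byte[]` (`[B` is the type tag for a byte array, followed by `@` and a hash code in hex). That suggests, though we have not confirmed it for this topology, that the downstream Java bolt received the field as raw bytes instead of a `String`. A minimal standalone sketch of the effect, where the class name, the `decode` helper, and the sample JSON are purely illustrative:

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayToStringDemo {
    // Illustrative helper: decodes a UTF-8 payload into the text it represents.
    static String decode(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample payload, made up for this demo.
        byte[] payload = "{\"Event\":\"something\"}".getBytes(StandardCharsets.UTF_8);

        // Default Object.toString() on an array: "[B@" followed by a hex hash code,
        // not the content of the array.
        System.out.println(payload.toString());

        // Explicit decoding recovers the JSON text.
        System.out.println(decode(payload));
    }
}
```

The hex part after `@` varies between runs, so the exact value carries no information about the payload.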

Most of the code comes from this tutorial: http://azure.microsoft.com/sv-se/documentation/articles/hdinsight-storm-develop-csharp-event-hub-topology/

This is our topology builder:

TopologyBuilder topologyBuilder = new TopologyBuilder("EventHubReaderTest");

        int partitionCount = Properties.Settings.Default.EventHubPartitionCount;

        JavaComponentConstructor constructor = JavaComponentConstructor.CreateFromClojureExpr(
            String.Format(@"(com.microsoft.eventhubs.spout.EventHubSpout. (com.microsoft.eventhubs.spout.EventHubSpoutConfig. " +
                @"""{0}"" ""{1}"" ""{2}"" ""{3}"" {4} ""{5}""))",
                Properties.Settings.Default.EventHubPolicyName,
                Properties.Settings.Default.EventHubPolicyKey,
                Properties.Settings.Default.EventHubNamespace,
                Properties.Settings.Default.EventHubNameInput,
                partitionCount,
                ""));

        topologyBuilder.SetJavaSpout(
            "EventHubSpout",
            constructor,
            partitionCount);

        List<string> javaSerializerInfo = new List<string>() { "microsoft.scp.storm.multilang.CustomizedInteropJSONSerializer" };

        topologyBuilder.SetBolt(
            "bolten",
            Bolt1.Get,
            new Dictionary<string, List<string>>()
            {
                {Constants.DEFAULT_STREAM_ID, new List<string>(){"Event"}}
            },
            partitionCount).
            DeclareCustomizedJavaSerializer(javaSerializerInfo).
            shuffleGrouping("EventHubSpout");

        JavaComponentConstructor constructorout =
            JavaComponentConstructor.CreateFromClojureExpr(
            String.Format(@"(com.microsoft.eventhubs.bolt.EventHubBolt. (com.microsoft.eventhubs.bolt.EventHubBoltConfig. " +
            @"""{0}"" ""{1}"" ""{2}"" ""{3}"" ""{4}"" {5}))",
            Properties.Settings.Default.EventHubPolicyName,
            Properties.Settings.Default.EventHubPolicyKey,
            Properties.Settings.Default.EventHubNamespace,
            "servicebus.windows.net", //suffix for servicebus fqdn
            Properties.Settings.Default.EventHubNameOutput,
            "true"));

        topologyBuilder.SetJavaBolt(
            "EventHubBolt",
            constructorout,
            partitionCount).
            shuffleGrouping("bolten");

        return topologyBuilder;

This is the bolt that is supposed to do some work:

public Bolt1(Context ctx)
    {
        this.ctx = ctx;

        Dictionary<string, List<Type>> inputSchema = new Dictionary<string, List<Type>>();
        inputSchema.Add("default", new List<Type>() { typeof(string) });

        Dictionary<string, List<Type>> outputSchema = new Dictionary<string, List<Type>>();

        outputSchema.Add("default", new List<Type>() { typeof(string) });

        this.ctx.DeclareComponentSchema(new ComponentStreamSchema(inputSchema, outputSchema));
        this.ctx.DeclareCustomizedDeserializer(new CustomizedInteropJSONDeserializer());

    }

    public static Bolt1 Get(Context ctx, Dictionary<string, Object> parms)
    {
        return new Bolt1(ctx);
    }
    //this is where the magic should happen
    public void Execute(SCPTuple tuple)
    {
        string test = "something";

        //we are currently just trying to emit a string
        ctx.Emit(new Values(test));


    }

We hope we have explained the problem well enough. We do not fully understand how the topology works, which makes troubleshooting difficult.

Edit: We solved it by declaring a deserializer in the topology:

        List<string> javaSerializerInfo = new List<string>() { "microsoft.scp.storm.multilang.CustomizedInteropJSONSerializer" };
        List<string> javaDeserializerInfo = new List<string>() { "microsoft.scp.storm.multilang.CustomizedInteropJSONDeserializer", "java.lang.String" };

        topologyBuilder.SetBolt(
            "bolten",
            Bolt1.Get,
            new Dictionary<string, List<string>>()
            {
                {Constants.DEFAULT_STREAM_ID, new List<string>(){"Event"}}
            },
            partitionCount).
            DeclareCustomizedJavaSerializer(javaSerializerInfo).
            DeclareCustomizedJavaDeserializer(javaDeserializerInfo).
            shuffleGrouping("EventHubSpout");

And in the custom C# bolt, we also declared a serializer:

this.ctx.DeclareComponentSchema(new ComponentStreamSchema(inputSchema, outputSchema));
        this.ctx.DeclareCustomizedDeserializer(new CustomizedInteropJSONDeserializer());
        this.ctx.DeclareCustomizedSerializer(new CustomizedInteropJSONSerializer());
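One way to picture why naming a target type such as `java.lang.String` on the Java side could make a difference is the sketch below. It is only an illustration under our own assumptions: `deserialize` is a hypothetical helper, not part of SCP.NET, and we have not checked how `CustomizedInteropJSONDeserializer` is actually implemented or how it interprets its arguments.

```java
import java.nio.charset.StandardCharsets;

public class TypedDeserializeSketch {
    // Hypothetical helper, NOT the SCP.NET implementation: converts a raw field
    // according to a declared target type name.
    static Object deserialize(byte[] raw, String targetType) {
        if ("java.lang.String".equals(targetType)) {
            // A declared String target: decode the bytes into text.
            return new String(raw, StandardCharsets.UTF_8);
        }
        // No known target type: the field is passed along as a byte[].
        return raw;
    }

    public static void main(String[] args) {
        byte[] raw = "something".getBytes(StandardCharsets.UTF_8);

        Object typed = deserialize(raw, "java.lang.String");
        Object untyped = deserialize(raw, null);

        System.out.println(typed);                     // prints: something
        System.out.println(untyped instanceof byte[]); // prints: true
    }
}
```

In this picture, a consumer that simply prints the field would show the text in the first case and a `[B@...` value in the second.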

0 answers:
