How to get the previous state of a counter before it is updated

Time: 2013-12-04 12:52:32

Tags: apache-storm trident

I have batches of 5 tuples, each containing an impression from a user:

Batch 1:
[UUID1, clientId1]
[UUID2, clientId1]
[UUID2, clientId1]
[UUID2, clientId1]
[UUID3, clientId2]

Batch 2:
[UUID4, clientId1]
[UUID5, clientId1]
[UUID5, clientId1]
[UUID6, clientId2]
[UUID6, clientId2]

This is how I persist the count state:

TridentState ClientState = impressionStream
    .groupBy(new Fields("clientId"))
    .persistentAggregate(getCassandraStateFactory("users", "DataComputation",
        "UserImpressionCounter"), new Count(), new Fields("count"));

Stream ClientStream = ClientState.newValuesStream();

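I start with an empty database and run my topology. After grouping the stream by clientId, I persist the state with persistentAggregate and the Count aggregator. For the first batch, the result after newValuesStream is [clientId1, 4], [clientId2, 1]. For the second batch it is [clientId1, 7], [clientId2, 3], as expected.

ClientStream is used in several branches, and in one of those branches I would need to process the tuples in batches of size 1, because I need the count as it stood at each tuple. A batch size of 1 is obviously nonsense, so instead I have to find out the previous state of the counter before it was updated and emit that value together with the tuple that carries the updated counter, e.g. [clientId1, 7, 4] for the second batch.

Does anyone know how to do this?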
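To make the expected numbers concrete, here is a minimal plain-Java sketch of the counting described above. It does not use Storm at all: the class RunningCountSketch and the method applyBatch are hypothetical names, the in-memory map stands in for the Cassandra-backed state, and the returned map only approximates what newValuesStream emits after each batch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for groupBy("clientId") + persistentAggregate(Count);
// a simplification for illustration, not how Trident is implemented.
public class RunningCountSketch {

    // Adds one batch of clientIds to the persisted totals and returns the
    // new total for every key that the batch touched.
    public static Map<String, Long> applyBatch(Map<String, Long> persisted,
                                               List<String> clientIds) {
        Map<String, Long> updated = new LinkedHashMap<>();
        for (String clientId : clientIds) {
            long total = persisted.merge(clientId, 1L, Long::sum);
            updated.put(clientId, total);
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Long> persisted = new LinkedHashMap<>();
        // Only the clientId column of each batch matters for the count.
        List<String> batch1 = Arrays.asList(
                "clientId1", "clientId1", "clientId1", "clientId1", "clientId2");
        List<String> batch2 = Arrays.asList(
                "clientId1", "clientId1", "clientId1", "clientId2", "clientId2");

        System.out.println(applyBatch(persisted, batch1)); // {clientId1=4, clientId2=1}
        System.out.println(applyBatch(persisted, batch2)); // {clientId1=7, clientId2=3}
    }
}
```

Note that after the second batch the old totals (4 and 1) are gone from the output, which is exactly the information the question asks for.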

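1 Answer:

Answer 0: (score: 0)

I solved this by adding a second aggregator that counts the tuples per client within each batch, and joining its output with the stream of the persistent aggregate: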

TridentState ClientState = impressionStream
    .groupBy(new Fields("clientId"))
    .persistentAggregate(getCassandraStateFactory("users", "DataComputation",
        "UserImpressionCounter"), new Count(), new Fields("count"));

Stream ClientBatchAggregationStream = impressionStream
    .groupBy(new Fields("clientId"))
    .aggregate(new SumCountAggregator(), new Fields("batchCount"));

Stream GroupingPeriodCounterStateStream = topology
    .join(ClientState.newValuesStream(), new Fields("clientId"),
        ClientBatchAggregationStream, new Fields("clientId"), 
        new Fields("clientId", "count", "batchCount"));

SumCountAggregator:

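// Package names as of Storm 0.9.x; from Storm 1.0 onward the same
// classes live under org.apache.storm.
import backtype.storm.tuple.Values;
import storm.trident.operation.BaseAggregator;
import storm.trident.operation.TridentCollector;
import storm.trident.tuple.TridentTuple;

public class SumCountAggregator extends BaseAggregator<SumCountAggregator.CountState> {

    // Mutable per-group accumulator for a single batch.
    static class CountState {
        long count = 0;
    }

    // Called once per group at the start of each batch.
    @Override
    public CountState init(Object batchId, TridentCollector collector) {
        return new CountState();
    }

    // Called for every tuple of the group in the current batch.
    @Override
    public void aggregate(CountState state, TridentTuple tuple, TridentCollector collector) {
        state.count += 1;
    }

    // Emits the number of tuples the group contributed to this batch.
    @Override
    public void complete(CountState state, TridentCollector collector) {
        collector.emit(new Values(state.count));
    }

}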
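The answer stops at the joined fields [clientId, count, batchCount]. Presumably the previous counter value is then count minus batchCount, since count is the total after the batch and batchCount is what the batch added. The plain-Java sketch below illustrates that arithmetic only; it does not use Storm, and the class PreviousCountSketch and the method withPreviousCount are hypothetical names, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the step after the join; an assumption about
// how the joined fields are meant to be used.
public class PreviousCountSketch {

    // Joins the two per-batch results on clientId and emits
    // [clientId, count, previousCount], with previousCount = count - batchCount.
    public static List<List<Object>> withPreviousCount(Map<String, Long> countAfterBatch,
                                                       Map<String, Long> batchCount) {
        List<List<Object>> out = new ArrayList<>();
        for (Map.Entry<String, Long> entry : countAfterBatch.entrySet()) {
            Long inBatch = batchCount.get(entry.getKey());
            if (inBatch == null) {
                continue; // inner join: skip clients missing from either side
            }
            long count = entry.getValue();
            out.add(Arrays.<Object>asList(entry.getKey(), count, count - inBatch));
        }
        return out;
    }

    public static void main(String[] args) {
        // Values for the second batch of the question.
        Map<String, Long> countAfterBatch = new LinkedHashMap<>();
        countAfterBatch.put("clientId1", 7L);
        countAfterBatch.put("clientId2", 3L);

        Map<String, Long> batchCount = new LinkedHashMap<>();
        batchCount.put("clientId1", 3L);
        batchCount.put("clientId2", 2L);

        // prints [[clientId1, 7, 4], [clientId2, 3, 1]]
        System.out.println(withPreviousCount(countAfterBatch, batchCount));
    }
}
```

In the topology itself the same subtraction could be applied to the joined stream with a small function over the count and batchCount fields; that wiring is left out here because it depends on the Storm version in use.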