Is there a way to count the total number of records in a PCollection in Python?

Asked: 2019-07-07 07:59:02

Tags: python apache-beam

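I join a fact table and several dimension tables from BigQuery into a single PCollection, and I need the total number of records that the join produces.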

all_dim_joined_pcol = join_fact_dim_tbl_obj.join_fact_dim_using_cogbk()

I want the number of records in the PCollection all_dim_joined_pcol.

1 answer:

Answer 0: (score: 1)

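I found a solution that counts the elements in a PCollection with Count.Globally(). Count is a class in the apache_beam.transforms.combiners module, and Count.Globally() produces a PCollection containing a single element: the total count.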

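import apache_beam as beam
from apache_beam.transforms.combiners import Count

# Count.Globally() emits one element: the number of records in the input.
counts = self.all_dim_joined_pcol | Count.Globally()

def collect(count):
    # tbl1 and tbl2 are the table names, defined elsewhere in the pipeline code.
    message = "Join done successfully between {} and {} having count as {}".format(tbl1, tbl2, count)
    print(message)
    return count

counts | "printing record count for " + fact_table_name + " " + dimension_table_name >> beam.Map(collect)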