Finding the output with the lowest sum using mrjob in Python

Time: 2019-04-16 02:29:01

Tags: python hadoop bigdata mrjob

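I need help outputting only the payment mode with the lowest total sales revenue (sum). Right now the code outputs the totals for every payment mode (Visa, MasterCard, Discover, etc.), but I only need the one payment mode whose total is lowest.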

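I have already grouped the records by payment mode and computed the sum for each one, but I am stuck on how to restrict the output to just the payment mode with the lowest total.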

"""
 Python script to find the total amount of sales revenue for each payment mode
 using Map-Reduce framework (mapper, combiner, and reducer functions) with mrjob package
"""
from mrjob.job import MRJob

class ModeRevenue(MRJob):
    # each input line consists of city, productCategory, price, and paymentMode

    # Initialize the count value
    count = 0

    def mapper(self, _, line):
        # create a key-value pair with key: paymentMode and value: price
        line_cols = line.split(',')
        yield line_cols[3], float(line_cols[2])

    def combiner(self, mode, counts):
        # consolidates all key-value pairs of mapper function (performed at mapper nodes)
        yield mode, sum(counts)

    def reducer(self, mode, counts):
        # final consolidation of key-value pairs at reducer nodes
        self.count += 1

        if self.count <= 5:
            yield mode, '${:,.2f}'.format(sum(counts))


if __name__ == '__main__':
    ModeRevenue.run()

I need the output to be the single payment mode with the lowest revenue, not all of them.
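
One way this is commonly approached is a two-stage reduction: first sum per payment mode, then send every total to one shared key so that a single reducer call can take the minimum. The sketch below illustrates only that logic in plain Python, without importing mrjob. The function names `reducer_sum`, `reducer_min`, and `run_local`, and the sample rows, are made up for illustration and are not part of the original script.

```python
from collections import defaultdict


def mapper(line):
    # Same parsing as the question: city, productCategory, price, paymentMode
    cols = line.split(',')
    yield cols[3], float(cols[2])


def reducer_sum(mode, prices):
    # Stage 1: total per payment mode, re-emitted under one shared key (None)
    # so that every total reaches the same stage-2 reducer call.
    yield None, (sum(prices), mode)


def reducer_min(_, total_mode_pairs):
    # Stage 2: pairs compare by their first element, so min() picks the
    # smallest total; ties fall back to comparing the mode name.
    total, mode = min(total_mode_pairs)
    yield mode, '${:,.2f}'.format(total)


def run_local(lines):
    # Tiny in-memory stand-in for the shuffle between stages (not mrjob itself).
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    stage1 = [pair for mode, prices in grouped.items()
              for pair in reducer_sum(mode, prices)]
    return list(reducer_min(None, (value for _, value in stage1)))


# Made-up sample rows, for illustration only
sample = [
    'Austin,Toys,10.00,Visa',
    'Boston,Books,4.50,Discover',
    'Austin,Music,7.25,Visa',
    'Denver,Toys,3.00,Discover',
    'Boston,Music,9.00,Cash',
]
print(run_local(sample))  # [('Discover', '$7.50')]
```

In mrjob itself, the same idea would presumably be wired up by overriding `steps()` to return two `MRStep` objects (from `mrjob.step`): the first with the mapper, the combiner, and the summing reducer, the second with only the minimum reducer. The `self.count <= 5` check would no longer be needed, since the second stage emits exactly one pair.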

Expected:

"Discover"      "$24,922,765.13"

Actual:

"Amex"  "$25,027,699.58"
"Cash"  "$25,030,603.02"
"Discover"      "$24,922,765.13"
"MasterCard"    "$24,952,916.42"
"Visa"  "$25,121,673.01"

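If changing the job is not an option, another possibility is to post-process the five rows the job already prints. The sketch below copies the totals from the actual output above and picks the smallest one on the client side. The helper `to_number` is a hypothetical name introduced here.

```python
# Totals copied from the "Actual" output above
actual = {
    "Amex": "$25,027,699.58",
    "Cash": "$25,030,603.02",
    "Discover": "$24,922,765.13",
    "MasterCard": "$24,952,916.42",
    "Visa": "$25,121,673.01",
}


def to_number(text):
    # Strip the currency formatting that the reducer applied
    return float(text.replace('$', '').replace(',', ''))


# min() over the dict iterates its keys; the key function ranks them by total
lowest_mode = min(actual, key=lambda mode: to_number(actual[mode]))
print(lowest_mode, actual[lowest_mode])  # Discover $24,922,765.13
```

This only works because the full result set is small. With many keys, doing the selection inside the job is likely the better fit.
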
0 answers:

No answers