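Finding the mode and median of a file using Python

Time: 2015-03-22 02:15:55

Tags: python list dictionary mode median

I'm having trouble with this part of my project's code. I have tried several ways of computing the mode and the median, but none of them worked. I do need to use a dictionary for the mode part, so any suggestions would be very helpful.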

    # Find median
    order = converted_numbers.sort()
    middle = count/2
    if middle % 2 == 0:
        median = (converted_numbers[middle - 1] + converted_numbers[middle]) / 2
    else:
        median = converted_numbers[middle]

    # Mode calculations
    number_counts = {}
    mode = 0
    freq = 0
    for i in converted_numbers:
        if i in number_counts:
            number_counts[i] += 1
        else:
            number_counts[i] = 1
    for i in number_counts:
        counts = int(number_counts[i])
        mode = max(counts)

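1 answer:

Answer 0 (score: 0)

Your code has a few problems:

  • list.sort() sorts the list in place and returns None, so order ends up as None. If you want a sorted list that is separate from the original, copy it first and sort the copy: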

    ordered_numbers = converted_numbers[:] # copy it
    ordered_numbers.sort()
    
  • Use floor division, and check whether count is even, not middle:

    middle = count // 2
    if count % 2 == 0:
        median = (ordered_numbers[middle - 1] + ordered_numbers[middle]) / 2
    else:
        median = ordered_numbers[middle]
    
  • To compute the mode, you can use the collections module from Python's standard library, which gets it quickly:

    from collections import Counter
    
    counter = Counter(ordered_numbers)   # the input does not need to be sorted
    mode = counter.most_common(1)[0][0]  # most_common(1) returns [(item, count)]
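
The median fixes from the first two bullets can be collected into one small function. This is only a minimal sketch: the name `median_of` and the sample lists are made up for illustration, and it assumes a non-empty list of numbers. It uses `sorted()`, which returns a new list, in place of copying and calling `sort()`.

```python
def median_of(numbers):
    """Return the median of a non-empty sequence of numbers."""
    ordered = sorted(numbers)   # sorted() returns a new list; list.sort() returns None
    count = len(ordered)
    middle = count // 2         # floor division keeps the index an int
    if count % 2 == 0:
        # Even count: average the two middle values.
        return (ordered[middle - 1] + ordered[middle]) / 2.0
    # Odd count: the single middle value.
    return ordered[middle]


sample = [7, 1, 3, 5]           # hypothetical data
print(median_of(sample))        # 4.0 (average of 3 and 5)
print(median_of([9, 2, 4]))     # 4 (middle value after sorting)
```

Dividing by 2.0 rather than 2 keeps the average a float even on Python 2, where `/` between two ints truncates.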
    

Edit:

Since a dictionary is required, we can simply write it as follows:

number_counts = {}
for num in ordered_numbers:
    if num in number_counts:
        number_counts[num] += 1  # seen before: increment the count
    else:
        number_counts[num] = 1   # first occurrence: start the count at 1

mode = ordered_numbers[0] # set to a number already in list
mode_freq = 0
for num, freq in number_counts.items():
    if freq > mode_freq:
        mode, mode_freq = num, freq
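
Note that the loop above keeps only the first value that reaches the highest count, so a tie between two values is settled by iteration order. If every tied value is wanted, one possible variation looks like the sketch below. The name `modes_of` is hypothetical, and returning a sorted list (empty for empty input) is an assumption about what is useful, not part of the original question.

```python
def modes_of(numbers):
    """Return a sorted list of the most frequent values in numbers."""
    number_counts = {}
    for num in numbers:
        # dict.get supplies 0 for a value that has not been seen yet.
        number_counts[num] = number_counts.get(num, 0) + 1
    if not number_counts:
        return []
    highest = max(number_counts.values())
    # Keep every value whose count equals the highest count.
    return sorted(num for num, freq in number_counts.items() if freq == highest)


print(modes_of([1, 2, 2, 3, 3]))  # [2, 3], since both occur twice
```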

Instead of a regular dict, you can use a defaultdict, so that you don't need to check whether the key already exists:

from collections import defaultdict

number_counts = defaultdict(int)
for num in ordered_numbers:
    number_counts[num] += 1

# mode: computed with the same loop as above
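
For completeness, here is a runnable version of the defaultdict approach. It is a sketch on made-up sample data, and it swaps the explicit comparison loop for `max()` with a key function, which is a different way of picking the same value.

```python
from collections import defaultdict

ordered_numbers = [1, 2, 2, 3, 3, 3]   # hypothetical sample data

number_counts = defaultdict(int)        # missing keys start at int() == 0
for num in ordered_numbers:
    number_counts[num] += 1

# Pick the key with the largest count; on a tie, max() returns
# the first such key it encounters.
mode = max(number_counts, key=number_counts.get)
print(mode, number_counts[mode])        # 3 3
```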