Question

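I want to find the count of each distinct value in the array sorted_array. After finding the distinct values and assigning them to the distinct_values array, I want to store each value's count at the same position in the distinct_values_count array, but my code doesn't seem to work. My code, which reads the output.txt file, looks like this: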

> sorted_array = [] 
> distinct_values = []
> distinct_values_count = [0]
> file = open('output.txt', 'r')
> 
> for line in file:
>     sorted_array.append(line.split('\n'))
> 
> sorted_array.sort()
> 
> for i in range(0, len(sorted_array)):
>     year = sorted_array[i][0]
>     if year not in distinct_values:
>         distinct_values.append(year)
>     if year in distinct_values:
>         pos = distinct_values.index(year)
>         distinct_values_count[pos] = sorted_array.count(year)
> 
> file.close()

I get this error:

IndexError: list assignment index out of range

Answer 1

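Much of what you are doing can be done more easily in Python. Working with a flat list is definitely easier than working with the two-level list you create by splitting each line with line.split('\n'). Is there a reason you are doing that?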
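
To make the difference concrete, here is a small sketch of what each call returns and why the loop in the question fails. The sample line "1983\n" is an assumption about what a line of output.txt looks like, since the file contents are not shown.

```python
# Assumed sample line; the real contents of output.txt are not shown.
line = "1983\n"

nested = line.split("\n")  # a list per line: ['1983', '']
flat = line.strip()        # just the string: '1983'

# With the nested form, sorted_array is a list of lists, so counting a
# plain string in it never matches anything.
sorted_array = [nested, nested]
print(sorted_array.count("1983"))   # 0
print([flat, flat].count("1983"))   # 2

# distinct_values_count = [0] has a single slot, so assigning to any index
# past 0 raises IndexError; appending would grow the list instead.
distinct_values_count = [0]
try:
    distinct_values_count[1] = 2
except IndexError as exc:
    error_message = str(exc)
print(error_message)  # list assignment index out of range
```

So the question's code has two separate problems: the counts would all be 0 because of the nesting, and the assignment fails as soon as a second distinct value appears.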

To create a sorted array of the file's contents:

with open('/tmp/file') as f:
    sorted_array=sorted(line.strip() for line in f)

To get the distinct values, use a set:

distinct_values=set(sorted_array)

To get the count of each distinct value:

distinct_value_count=[(e, sorted_array.count(e)) for e in distinct_values]

If you want that sorted:

distinct_value_count=sorted((e, sorted_array.count(e)) for e in distinct_values)

Then:

>>> sorted_array
['1977', '1978', '1982', '1983', '1983', '1987', '1988', '1996', '2006', '2011', '2012', '2013']
>>> distinct_values
{'2011', '2006', '1996', '1978', '1977', '1987', '1983', '2012', '1982', '1988', '2013'}
>>> distinct_value_count
[('1977', 1), ('1978', 1), ('1982', 1), ('1983', 2), ('1987', 1), ('1988', 1), ('1996', 1), ('2006', 1), ('2011', 1), ('2012', 1), ('2013', 1)]
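
If you really do want two lists that line up by position, as the question describes, the sorted pairs can be split apart. This is only a sketch: it reuses the sample data from the session above, and here distinct_values is a list rather than the set used earlier.

```python
# Sample data copied from the session above.
sorted_array = ['1977', '1978', '1982', '1983', '1983', '1987',
                '1988', '1996', '2006', '2011', '2012', '2013']

distinct_value_count = sorted((e, sorted_array.count(e)) for e in set(sorted_array))

# Split the (value, count) pairs into two lists that share positions.
distinct_values = [value for value, _ in distinct_value_count]
distinct_values_count = [count for _, count in distinct_value_count]

pos = distinct_values.index('1983')
print(distinct_values[pos], distinct_values_count[pos])  # 1983 2
```

Building both lists from the same sequence of pairs keeps the positions in step, which avoids the index bookkeeping that caused the original IndexError.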

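Alternatively, use a dict, which removes the need to create a separate set and a separate list of counts, since the keys of a dict are also unique (but come out in no particular order here, because they are taken from a set):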

>>> dv_dict={k:sorted_array.count(k) for k in set(sorted_array)}
>>> dv_dict
{'1996': 1, '2011': 1, '1978': 1, '1988': 1, '2013': 1, '1977': 1, '1987': 1, '2006': 1, '2012': 1, '1983': 2, '1982': 1}
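
One further option, not part of the approach above: the standard library's collections.Counter builds the same value-to-count mapping in a single pass, whereas sorted_array.count(k) rescans the whole list once per distinct value. A minimal sketch using the same sample data:

```python
from collections import Counter

# Sample data copied from the session above.
sorted_array = ['1977', '1978', '1982', '1983', '1983', '1987',
                '1988', '1996', '2006', '2011', '2012', '2013']

counts = Counter(sorted_array)  # maps each value to how often it occurs
print(counts['1983'])  # 2

# Counter is a dict subclass, so items() can be sorted like any dict's.
sorted_counts = sorted(counts.items())
print(sorted_counts[:4])  # [('1977', 1), ('1978', 1), ('1982', 1), ('1983', 2)]
```

For a list this small the difference does not matter; it only becomes noticeable with many distinct values.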

Find the count of the distinct values of an array and assign them to another array
