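I want to count how often each value occurs in a column and, for the values whose frequency is greater than 3, fill an array with those rows.
For example: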
402 Sony
403 Sony
404 Sony
405 ZTE
406 ZTE
407 ZTE
408 ZTE
409 ZTE
Desired array:
[
"ZTE",
"ZTE",
"ZTE",
"ZTE",
"ZTE"]
Answer 0 (score: 0)
Use collections.Counter() to get the frequency of each value. Assuming the data you provided comes from a file named test.txt:
import collections

itemList = []
with open("test.txt", 'r') as f:
    for line in f:
        # The second whitespace-separated field is the value to count
        itemList.append(line.split()[1])

# Map each value to the number of times it occurs
result = collections.Counter(itemList)
print(result)

# Get the n most common values as a list of (value, count) tuples:
n = 1
print(result.most_common(n))
With the input:
402 Sony
403 Sony
404 Sony
405 ZTE
406 ZTE
407 ZTE
408 ZTE
409 ZTE
the output is:
Counter({'ZTE': 5, 'Sony': 3})
[('ZTE', 5)]
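The counts above still need one more step to produce the array the question asks for. A minimal sketch of that step, assuming "frequency greater than 3" means a count strictly above 3 and that every non-blank line has at least two whitespace-separated fields. The names rows, frequent_values and threshold are hypothetical and chosen only for illustration:

```python
import collections

# Sample rows from the question; in practice they could be read from test.txt
rows = [
    "402 Sony", "403 Sony", "404 Sony",
    "405 ZTE", "406 ZTE", "407 ZTE", "408 ZTE", "409 ZTE",
]

def frequent_values(lines, threshold=3):
    """Return each value (second field) whose count exceeds threshold.

    Values are kept once per occurrence and in their original order.
    """
    # Skip blank lines so a trailing newline in a file does not raise IndexError
    values = [line.split()[1] for line in lines if line.strip()]
    counts = collections.Counter(values)
    return [value for value in values if counts[value] > threshold]

print(frequent_values(rows))  # ['ZTE', 'ZTE', 'ZTE', 'ZTE', 'ZTE']
```

Sony occurs exactly 3 times, so it is excluded by the strict comparison; if "at least 3" were intended, the comparison would be >= instead.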