How do I store cumulative output in a list?

Asked: 2011-12-22 15:37:28

Tags: python list

I am using the following code to get the letter frequencies in a text:

for s in 'abcdefghijklmnopqrstuvwxyz ':
    count = 0
    for char in rawpunct.lower():
        if s == char:
            count +=1
    result = s, '%.3f' % (count*100/len(rawpunct.lower()))
    f_list.append(result)

The result is:

['0.061', '0.012', '0.017', '0.030', '0.093', '0.016', '0.016', 
'0.049', '0.050', '0.001', '0.006', '0.034', '0.018', '0.052', '0.055',
 '0.013', '0.001', '0.041', '0.050', '0.069', '0.021', '0.007', '0.017',
 '0.001', '0.013', '0.000', '0.159']

But I want to store the cumulative frequencies, i.e. build this list:

['0.061', '0.073', '0.100', '0.130' ............ ]

Does anyone know how to do this?

7 answers:

Answer 0: (score: 3)

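You can import numpy, turn the results into an array with results = numpy.array(result), and finally compute f_list = numpy.cumsum(results). Note that the array has to hold numbers, so the formatted strings need to be converted to floats before summing.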
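If NumPy is not installed, the standard library can produce the same running sum. The following is a minimal sketch using itertools.accumulate (Python 3.2+) rather than the NumPy call described above; the name freqs and the shortened input are only for illustration.

```python
from itertools import accumulate

# First few values from the question, still as formatted strings.
freqs = ['0.061', '0.012', '0.017', '0.030']

# Convert to float before summing, then format only for display.
cumulative = list(accumulate(float(f) for f in freqs))
f_list = ['%.3f' % c for c in cumulative]
print(f_list)  # ['0.061', '0.073', '0.090', '0.120']
```

Keeping the floats in cumulative and formatting into a separate list means the numeric values stay available for further arithmetic.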

Answer 1: (score: 2)

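letters = 'abcdefghijklmnopqrstuvwxyz '
counts = dict.fromkeys(letters, 0)
text = rawpunct.lower()
for char in text:
    try:
        counts[char] += 1
    except KeyError:
        # this character in rawpunct should not be counted!
        pass
total = float(len(text))  # float, so the division below is not truncated
f_list = [0.0]
for s in letters:
    # running total of the fraction of the text taken up by each letter
    f_list.append(f_list[-1] + counts[s] / total)
str_list = ['{0:.3f}'.format(f) for f in f_list[1:]]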

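My f_list is a list of floats (summing floats is easier than summing their string representations!). At the end I build str_list, which is the list of string representations of those floats. Since you don't want the list to start with zero, the leading element is dropped at the end (simply f_list[1:]).

If your input text is very long, this solution will be faster, because it reads the text only once.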
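The single-pass idea can also be sketched with collections.Counter, which does the counting in one call. This is only an illustration under assumptions: the function name cumulative_frequencies and the sample input are hypothetical, and the denominator is the full text length, as in the question.

```python
from collections import Counter

def cumulative_frequencies(text, letters='abcdefghijklmnopqrstuvwxyz '):
    """Return the running fraction of `text` covered by each symbol in `letters`."""
    text = text.lower()
    if not text:
        return [0.0] * len(letters)
    counts = Counter(text)   # one pass over the text
    total = len(text)        # same denominator as in the question
    running = 0.0
    result = []
    for s in letters:
        running += counts[s] / total   # Counter returns 0 for unseen symbols
        result.append(running)
    return result

sample = 'ab ba'   # hypothetical input
cum = cumulative_frequencies(sample)
print(['%.3f' % c for c in cum[:3]])  # ['0.400', '0.800', '0.800']
```

Characters outside letters are counted by Counter but never looked up, so they only affect the denominator.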

Answer 2: (score: 2)

Just for the fun of a one-liner:

original = ['0.061', '0.012', '0.017', '0.030', '0.093', '0.016', '0.016', 
'0.049', '0.050', '0.001', '0.006', '0.034', '0.018', '0.052', '0.055',
 '0.013', '0.001', '0.041', '0.050', '0.069', '0.021', '0.007', '0.017',
 '0.001', '0.013', '0.000', '0.159']

result = [sum(float(item) for item in original[0:rank+1]) for rank in range(len(original))]

>>> [0.061, 0.073, 0.09, 0.12, 0.213, 0.22899999999999998, 0.245, 0.294, 0.344, 0.345, 0.351, 0.385, 0.403, 0.455, 0.51, 0.523, 0.524, 0.5650000000000001, 0.6150000000000001, 0.6840000000000002, 0.7050000000000002, 0.7120000000000002, 0.7290000000000002, 0.7300000000000002, 0.7430000000000002, 0.7430000000000002, 0.9020000000000002]
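Values such as 0.22899999999999998 in that output come from binary floating-point rounding. If exact three-decimal results matter, one option (a sketch, not part of the answer above) is to sum with decimal.Decimal, and to keep a running total instead of re-summing a slice for every position. Only the first six values from the question are used here.

```python
from decimal import Decimal

original = ['0.061', '0.012', '0.017', '0.030', '0.093', '0.016']

running = Decimal('0')
result = []
for item in original:
    running += Decimal(item)      # exact decimal addition, no binary rounding
    result.append(str(running))   # trailing zeros are preserved
print(result)
# ['0.061', '0.073', '0.090', '0.120', '0.213', '0.229']
```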

Answer 3: (score: 1)

# assumes result is a number (e.g. a float), not a formatted string
if not f_list:
    f_list.append(result)
else:
    f_list.append(f_list[-1] + result)

Answer 4: (score: 1)

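f_list = [0.0]
text = rawpunct.lower()
for s in 'abcdefghijklmnopqrstuvwxyz ':
    count = 0
    for char in text:
        if s == char:
            count += 1
    # keep a number here (not a formatted string) so it can be added up
    result = count * 100.0 / len(text)
    f_list.append(result + f_list[-1])
# drop the leading 0.0 and format only at the end
f_list = ['%.3f' % f for f in f_list[1:]]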

Answer 5: (score: 0)

My version of cumsum, using reduce (on Python 3, reduce has to be imported from functools first):

In [1]: x = [1,2,3]
In [2]: reduce(lambda acc, x: acc + [acc[-1] + x], x[1:], x[:1])
Out[2]: [1, 3, 6]

It also works for an empty list:

In [3]: x = []
In [4]: reduce(lambda acc, x: acc + [acc[-1] + x], x[1:], x[:1])
Out[4]: []
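One thing to keep in mind with the reduce version: acc + [...] builds a new list at every step, so the work grows quadratically with the input length. A generator keeps it linear. The sketch below is one possible way to write that; the name cumsum is chosen for illustration only.

```python
def cumsum(values):
    """Yield the running totals of `values`; yields nothing for empty input."""
    total = None
    for value in values:
        # The first element starts the total, later ones are added to it.
        total = value if total is None else total + value
        yield total

print(list(cumsum([1, 2, 3])))  # [1, 3, 6]
print(list(cumsum([])))         # []
```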

Answer 6: (score: 0)

I am guessing rawpunct is the string containing the text. In my suggestion I have replaced it with text:

from string import ascii_lowercase

text = 'Some arbitrary  Text with NonNSense! @#!.+-'.lower()
chmap = ascii_lowercase + ' '
cooked_text = ''.join([i for i in text if i in chmap])
chdict = dict.fromkeys(chmap, 0)       # set up the totals dict
frequencies = dict.fromkeys(chmap, 0)  # set up the fractions dict

for ch in cooked_text:  # totals per char
    chdict[ch] += 1

for char in chdict:  # relative to text length
    frequencies[char] = float(chdict[char]) / len(cooked_text)

frequency_list = [frequencies[char] for char in chmap]
frequency_strlist = ['%.3f' % f for f in frequency_list]
print(frequency_strlist)