Question

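I have a script that loops over multiple files. For each file, I count how often particular combinations occur in that file.

I do this with the following code: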

from collections import Counter
import operator

with open("%s" % files) as f:
    freqs = {}
    sortedFreqs = []

    # read the lines of the CSV file
    for l in f:
        # some code here (not added) which fills the mutationList value
        pass

    # this Counter stores how often each mutation occurs
    freqs = Counter(mutationList)

    # the same entries, sorted by count (Python 2; use freqs.items() on Python 3)
    sortedFreqs = sorted(freqs.iteritems(), key=operator.itemgetter(1), reverse=True)

So the freqs variable holds a long collection of entries.

Example:

'FAM123Ap.Y550': 1, 'SMARCB1p.D192': 1, 'CSMD3p.T1137': 3

I then sort them by the second value (the count); the result is stored in sortedFreqs.

Example:

'CSMD3p.T1137': 3, 'FAM123Ap.Y550': 1, 'SMARCB1p.D192': 1

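This all works fine, but I now want to loop over multiple files and add all the frequencies found together. So if I find the value 'CSMD3p.T1137' 2 more times in another file, I want to store 'CSMD3p.T1137': 5.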

Wanted output:
totalFreqs = 'FAM123Ap.Y550': 1, 'SMARCB1p.D192': 1, 'CSMD3p.T1137': 5, 'TRPM1p.R551': 2
totalFreqsSorted = 'CSMD3p.T1137': 5, 'TRPM1p.R551': 2, 'FAM123Ap.Y550': 1, 'SMARCB1p.D192': 1

How do I "add" the values of matching keys across dictionaries in Python? (How do I correctly fill totalFreqs and totalFreqsSorted?)

Answer 1

Use a single Counter() object for all the counts, and update it for each file:

from collections import Counter

freqs = Counter()

for file in files:
    with open(...) as f:
        # fill mutationList for this file (code not shown)

        freqs.update(mutationList)
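As a minimal sketch of the update approach, the example below replaces the file handling with two in-memory lists. The name `per_file_mutations` and its contents are hypothetical stand-ins for the `mutationList` that the question's elided parsing code would produce per file; the values are chosen to match the question's example counts.

```python
from collections import Counter

# Hypothetical per-file mutation lists, standing in for the parsing
# code that the question does not show.
per_file_mutations = [
    ["CSMD3p.T1137", "FAM123Ap.Y550", "CSMD3p.T1137",
     "SMARCB1p.D192", "CSMD3p.T1137"],
    ["CSMD3p.T1137", "TRPM1p.R551", "CSMD3p.T1137", "TRPM1p.R551"],
]

total_freqs = Counter()
for mutation_list in per_file_mutations:
    # update() adds one to the count of each item in the iterable,
    # on top of whatever was counted for earlier files.
    total_freqs.update(mutation_list)

print(total_freqs["CSMD3p.T1137"])  # 5
```

Because a single Counter accumulates across iterations, no per-file dictionary needs to be kept around.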

Alternatively, you can combine counters simply by adding them together:

total_freqs = Counter()

for file in files:
    with open(...) as f:
        # fill mutationList for this file (code not shown)

        freqs = Counter(mutationList)
        total_freqs += freqs
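A small sketch of the addition approach, using two hand-built counters (the names `first_file` and `second_file` are illustrative only) in place of the per-file results:

```python
from collections import Counter

# Illustrative per-file counters; in the real script each would be
# built as Counter(mutationList) for one file.
first_file = Counter({"CSMD3p.T1137": 3, "FAM123Ap.Y550": 1, "SMARCB1p.D192": 1})
second_file = Counter({"CSMD3p.T1137": 2, "TRPM1p.R551": 2})

total_freqs = Counter()
for freqs in (first_file, second_file):
    # += sums the counts of matching keys and adds any new keys.
    total_freqs += freqs

print(total_freqs["CSMD3p.T1137"])  # 5
```

One difference worth knowing: Counter addition keeps only positive counts, whereas update() leaves zero or negative counts in place. For plain occurrence counts like these, both give the same totals.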

Note that Counter() objects already provide a frequency list sorted in descending order; just use the Counter.most_common() method instead of sorting it yourself:

sortedFreqs = freqs.most_common()
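To illustrate, here is a short sketch that applies most_common() to the totals from the question's wanted output. The names `total_freqs`, `total_freqs_sorted`, and `top_two` are chosen for this example only.

```python
from collections import Counter

# Totals taken from the question's wanted output.
total_freqs = Counter({
    "FAM123Ap.Y550": 1,
    "SMARCB1p.D192": 1,
    "CSMD3p.T1137": 5,
    "TRPM1p.R551": 2,
})

# All (key, count) pairs, highest count first.
total_freqs_sorted = total_freqs.most_common()

# Passing a number limits the result to that many top entries.
top_two = total_freqs.most_common(2)
print(top_two)  # [('CSMD3p.T1137', 5), ('TRPM1p.R551', 2)]
```

The result is a list of tuples rather than a dictionary, the same shape that sorting the items by count would give.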
