Question

I have a list of tuples containing hashtags and their frequencies, for example:

[('#Example', 92002),
 ('#example', 65544)]

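I want to sum the entries whose first element is the same string apart from letter case, keeping the spelling of the entry that has the highest value in its second element. The list above would become: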

[('#Example', 157546)]

So far I have tried:

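import operator

res = []  # accumulated (hashtag, total) pairs

for hashtag in hashtag_freq_list:
    # Skip hashtags whose lowercase form has already been merged into res
    if hashtag[0].lower() not in [res_entry[0].lower() for res_entry in res]:
        entries = [entry for entry in hashtag_freq_list if hashtag[0].lower() == entry[0].lower()]
        # Keep the spelling of the most frequent variant
        k = max(entries, key=operator.itemgetter(1))[0]
        v = sum(entry[1] for entry in entries)
        res.append((k, v))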

I am just wondering whether this can be handled in a more elegant way?

Answer 1

I would use a dictionary:

data = [('#example', 65544),('#Example', 92002)]

hashtable = {}

for i in data:

    # See if this thing exists regardless of casing
    if i[0].lower() not in hashtable:

        # Create a dictionary
        hashtable[i[0].lower()] = {
            'meta':'',
            'value':[]
        }

        # Copy the relevant information
        hashtable[i[0].lower()]['value'].append(i[1])
        hashtable[i[0].lower()]['meta'] = i[0]

    # If the value exists
    else:

        # Check if the number it holds is the max against 
        # what was collected so far. If so, change meta
        if i[1] > max(hashtable[i[0].lower()]['value']):
            hashtable[i[0].lower()]['meta'] = i[0]

        # Append the value regardless
        hashtable[i[0].lower()]['value'].append(i[1])

# For output purposes
myList = []

# Build the tuples
for node in hashtable:
    myList.append((hashtable[node]['meta'],sum(hashtable[node]['value'])))

# Voila!
print(myList)
# [('#Example', 157546)]
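
The same dictionary idea can probably be tightened so that each group stores only the best spelling, its count, and a running total, instead of a list of every value. The following is a minimal sketch under that assumption, not a definitive implementation; the function name merge_case_insensitive is hypothetical and chosen here only for illustration, and ties are resolved in favour of the spelling seen first.

```python
def merge_case_insensitive(pairs):
    """Sum counts of tags that differ only by letter case.

    The spelling kept for each group is the one with the highest
    individual count (the first one seen wins on ties).
    merge_case_insensitive is a hypothetical name used for illustration.
    """
    # lowercased tag -> [best_spelling, best_count, running_total]
    totals = {}
    for tag, count in pairs:
        key = tag.lower()
        if key not in totals:
            totals[key] = [tag, count, count]
        else:
            entry = totals[key]
            # Switch spelling only if this variant is strictly more frequent
            if count > entry[1]:
                entry[0], entry[1] = tag, count
            entry[2] += count
    return [(best, total) for best, _, total in totals.values()]


data = [('#example', 65544), ('#Example', 92002)]
print(merge_case_insensitive(data))
# [('#Example', 157546)]
```

This makes a single pass over the input and avoids calling max() on a growing list for every repeated tag. The order of the result relies on dictionaries preserving insertion order, which holds in Python 3.7 and later.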
