How do I sum the values in a list of lists, grouped by a specific value in each list?

Time: 2018-12-21 05:11:09

Tags: python

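I have a list of lists, and each inner list has the following items:

site, count, time

Sample data:

site1, 15, 20

I'm trying to figure out the best way to approach this. I want to add up the count and the time for each site.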

I thought about converting the data to a dictionary as I iterate over each list, but I'm not sure what that would get me.

for site, count, time in lists:
    # create a dictionary, then what?
    pass

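As the end result, I'd like a list or a dictionary (any data structure will do) in which the count and time for each site are added up into a "totals" entry for that site.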

For example:

site, total_count, total_time

Sample data:

site1, 50, 100  # all data for site1 added up
site2, 40, 300  # all data for site2 added up

I'm not looking for a coded answer, just the right approach and a push in the right direction.

5 answers:

Answer 0: (score: 1)

You said any data structure would do, so one option is to build a DataFrame from the list you have and then use groupby followed by sum to get what you want.

Example

import pandas as pd
data = [['site1',15,20],['site1',35,80],['site2',15,20]]
# Column names follow the question's order: site, count, time
df = pd.DataFrame(data,columns=['site','count','time'])
print(df.groupby('site').sum())

Output

       count  time
site              
site1     50   100
site2     15    20

Alternatively

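data = [['site1',15,20],['site1',35,80],['site2',15,20]]
data_d = {}
for rec in data:
    if rec[0] in data_d:
        # Site already seen: add this record to its running totals
        data_d[rec[0]][0] += rec[1]
        data_d[rec[0]][1] += rec[2]
    else:
        # First record for this site: start the totals from a copy of its values
        data_d[rec[0]] = rec[1:]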

Answer 1: (score: 0)

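You can iterate over the list of lists (preferably changed to a list of tuples) and add each count and time to the running total count and total time in an output dictionary keyed by site: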

lists = [
    ('site1', 15, 20),
    ('site2', 10, 30),
    ('site1', 5, 25),
    ('site1', 30, 55),
    ('site2', 30, 270)
]
result = {}
for site, count, time in lists:
    total_count, total_time = result.get(site, (0, 0))
    result[site] = (total_count + count, total_time + time)

result becomes:

{'site1': (50, 100), 'site2': (40, 300)}
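
The same loop can also be written with collections.defaultdict, which supplies the starting totals for a site the first time it appears. This is only a minimal sketch of that variation, not part of the answer above; the name rows and the choice of a two-element list [total_count, total_time] as the value are assumptions made for illustration.

```python
from collections import defaultdict

# Same sample rows as above, in (site, count, time) order.
rows = [
    ('site1', 15, 20),
    ('site2', 10, 30),
    ('site1', 5, 25),
    ('site1', 30, 55),
    ('site2', 30, 270),
]

# Each new site starts with [total_count, total_time] == [0, 0].
totals = defaultdict(lambda: [0, 0])
for site, count, time in rows:
    totals[site][0] += count
    totals[site][1] += time

# Convert back to a plain dict so missing keys raise KeyError again.
result = dict(totals)
print(result)
```

A mutable list is used here so the totals can be updated in place; with tuples, as in the answer above, a new tuple has to be built on every iteration.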

Answer 2: (score: 0)

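The question is still a bit ambiguous, but you could, for example, build a class that uses a dictionary of dictionaries. It aggregates the data incrementally as you feed records into it: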

>>> class SiteAggregator:
...     def __init__(self):
...         self.sites = {}
...     def __call__(self, data):
...         site_name, site_counts, site_time = data
...         if site_name not in self.sites:
...             self.sites[site_name] = {'counts': 0, 'time': 0}
...         self.sites[site_name]['counts'] += site_counts
...         self.sites[site_name]['time'] += site_time
...
>>> site_agg = SiteAggregator()
>>> site_agg(['a', 20, 22])
>>> site_agg(['b', 10, 13])
>>> site_agg.sites['a']
{'counts': 20, 'time': 22}
>>> site_agg(['a', 10, 12])
>>> site_agg.sites['a']
{'counts': 30, 'time': 34}
>>> sites = [['a', 20, 10], ['b', 30, 15], ['c', 18, 22], ['a', 15, 22], ['b', 10, 2]]
>>> for site in sites:
...     site_agg(site)
...
>>> site_agg.sites['a']
{'counts': 65, 'time': 66}

Answer 3: (score: 0)

In my opinion, the following is the right way to approach this problem.

import json  # For pretty-printing the dictionary

# List of lists where each sub list contains site, count, time in order
data_list = [
    ["mysite1.com", 11, 88],
    ["mysite1.com", 7, 6],
    ["google.com", 6, 23],
    ["mysite2.com", 9, 12],
    ["google.com", 4, 7],
    ['mysite1.com', 9, 12],
    ['mysite2.com', 13, 4]
]

d = {}

for l in data_list:
    site, count, time = l # Unpacking

    if site in d:
        # APPEND/UPDATE VALUES
        d[site]["count"].append(count)
        d[site]["time"].append(time)
    else:
        # CREATE NEW KEYS WITH DATA
        d[site] = {
            "count": [count],
            "time": [time]
        }

    d[site]["total_count"] = sum(d[site]["count"])
    d[site]["total_time"] = sum(d[site]["time"])

print(json.dumps(d, indent=4))

# {
#     "mysite1.com": {
#         "count": [
#             11,
#             7,
#             9
#         ],
#         "time": [
#             88,
#             6,
#             12
#         ],
#         "total_count": 27,
#         "total_time": 106
#     },
#     "google.com": {
#         "count": [
#             6,
#             4
#         ],
#         "time": [
#             23,
#             7
#         ],
#         "total_count": 10,
#         "total_time": 30
#     },
#     "mysite2.com": {
#         "count": [
#             9,
#             13
#         ],
#         "time": [
#             12,
#             4
#         ],
#         "total_count": 22,
#         "total_time": 16
#     }
# }

Answer 4: (score: 0)

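Here is a hacky approach (inspired by electrical engineering): use a Counter whose values are complex numbers; the real part is the time and the imaginary part is the count. ;-)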
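
The answer includes no code, so the following is only a sketch of how that idea might look, assuming collections.Counter from the standard library and rows in the question's (site, count, time) order. The names rows, totals, and summary are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows in (site, count, time) order.
rows = [
    ('site1', 15, 20),
    ('site2', 10, 30),
    ('site1', 5, 25),
    ('site1', 30, 55),
    ('site2', 30, 270),
]

totals = Counter()
for site, count, time in rows:
    # Pack both numbers into one value: real part = time, imaginary part = count.
    # A missing key in a Counter reads as 0, so += works for new sites too.
    totals[site] += complex(time, count)

# Unpack each complex total back into a (total_count, total_time) pair.
summary = {site: (int(z.imag), int(z.real)) for site, z in totals.items()}
print(summary)
```

Both parts of a complex number are stored as floats, so this trick is only exact for totals that fit in a float without rounding (integers up to 2**53); a plain dictionary of tuples or lists is the safer choice for real use.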