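Need help extracting data from a list of dictionaries with a timeline and counts

Time: 2021-07-13 05:19:51

Tags: python pandas list dictionary datetime

Hi, I want to compute some counts from a list of dictionaries in Python, where the data looks like the `people` list below. When name == "Tom", I want the counts of the city and country keys and of each country.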

I expect output in the following format:

name: 12

Tom: 5

city: 3

Counts with a timeline:

city: NewYork

Jan 1, Feb 2, Mar 2

city: London

Jan 2, Feb 1, Mar 4

people = [
    {"name": "Tom", "age": 10, "city": "NewYork", "Date": "2021-01-04 08:37:19Z"},
    {"name": "Mark", "age": 5, "country": "Japan", "Date": "2021-01-06 08:37:24Z"},
    {"name": "Pam", "age": 7, "city": "London", "Date": "2021-01-04 09:26:38Z"},
    {"name": "Tom", "hight": 163, "city": "California", "Date": "2021-01-08 12:50:17Z"},
    {"name": "Lena", "weight": 45, "country": "Italy", "Date": "2021-01-08 12:50:17Z"},
    {"name": "Ben", "age": 17, "city": "Colombo", "Date": "2021-01-21 09:56:04Z"},
    {"name": "Lena", "gender": "Female", "country": "Italy", "Date": "2021-01-21 09:56:04Z"},
    {"name": "Ben", "gender": "Male", "city": "Colombo", "Date": "2021-02-09 08:47:26Z"},
    {"name": "Tom", "age": 10, "country": "Italy", "Date": "2021-02-09 08:47:26Z"},
    {"name": "Mark", "age": 5, "country": "Japan", "Date": "2021-02-23 09:10:59Z"},
    {"name": "Tom", "age": 7, "city": "London", "Date": "2021-03-08 09:39:28Z"},
    {"name": "Tom", "hight": 163, "country": "Japan", "Date": "2021-03-08 09:39:28Z"},
]

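1 Answer:

Answer 0: (score: 0)

This is just a matter of thinking step by step about "what do I have" and "what do I need". Note that the dates below are written as MM/DD/YYYY strings.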

people = [
    {"name": "Tom", "age": 10, "city": "NewYork", "Date": '01/01/2021'},
    {"name": "Mark", "age": 5, "country": "Japan", "Date": '05/01/2021'},
    {"name": "Pam", "age": 7, "city": "London", "Date": '03/06/2021'},
    {"name": "Tom", "hight": 163, "city": "California", "Date": '04/06/2021'},
    {"name": "Lena", "weight": 45, "country": "Italy", "Date": '12/12/2020'},
    {"name": "Ben", "age": 17, "city": "Colombo", "Date": '11/12/2020'},
    {"name": "Lena", "gender": "Female", "country": "Italy", "Date": '8/01/2020'},
    {"name": "Ben", "gender": "Male", "city": "Colombo", "Date": '7/01/2020'},
    {"name": "Tom", "age": 10, "country": "Italy", "Date": '01/01/2021'},
    {"name": "Mark", "age": 5, "country": "Japan", "Date": '05/01/2021'},
    {"name": "Tom", "age": 7, "city": "London", "Date": '03/06/2021'},
    {"name": "Tom", "hight": 163, "country": "Japan", "Date": '04/06/2021'}
]

def groupby( fld ):
    vals = { fld: 0 }
    for row in people:
        if fld in row:
            vals[fld] += 1
            if row[fld] not in vals:
                vals[row[fld]] = 1
            else:
                vals[row[fld]] += 1
    return vals

months = ('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec')
def groupbydate( fld ):
    vals = {}
    for row in people:
        if fld in row and 'Date' in row:
            month = months[int(row['Date'].lstrip('0').split('/')[0])-1]
            if row[fld] not in vals:
                vals[row[fld]] = {}
            if month not in vals[row[fld]]:
                vals[row[fld]][month] = 1
            else:
                vals[row[fld]][month] += 1
    return vals

print( groupby( 'name' ) )
print( groupby( 'city' ) )
print( groupby( 'country' ) )
print( )
print( groupbydate( 'city' ) )

Output:

{'name': 12, 'Tom': 5, 'Mark': 2, 'Pam': 1, 'Lena': 2, 'Ben': 2}
{'city': 6, 'NewYork': 1, 'London': 2, 'California': 1, 'Colombo': 2}
{'country': 6, 'Japan': 3, 'Italy': 3}

{'NewYork': {'Jan': 1}, 'London': {'Mar': 2}, 'California': {'Apr': 1}, 'Colombo': {'Nov': 1, 'Jul': 1}}
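
The counts above cover every name. The question also asks for counts restricted to one person, so here is a minimal sketch of that filter. The helper name `count_field_for` and the case-insensitive name match are assumptions for illustration, and the rows are a shortened sample of the list above.

```python
from collections import Counter

# Shortened sample of the rows above; dates are MM/DD/YYYY strings.
people = [
    {"name": "Tom", "age": 10, "city": "NewYork", "Date": "01/01/2021"},
    {"name": "Pam", "age": 7, "city": "London", "Date": "03/06/2021"},
    {"name": "Tom", "hight": 163, "city": "California", "Date": "04/06/2021"},
    {"name": "Tom", "age": 10, "country": "Italy", "Date": "01/01/2021"},
    {"name": "Tom", "age": 7, "city": "London", "Date": "03/06/2021"},
    {"name": "Tom", "hight": 163, "country": "Japan", "Date": "04/06/2021"},
]

def count_field_for(rows, name, fld):
    """Count each value of `fld` among rows whose name matches (hypothetical helper)."""
    counts = Counter()
    for row in rows:
        # Compare names case-insensitively so "TOM" matches "Tom" (an assumption).
        if row.get("name", "").lower() == name.lower() and fld in row:
            counts[row[fld]] += 1
    return dict(counts)

tom_cities = count_field_for(people, "TOM", "city")
tom_countries = count_field_for(people, "TOM", "country")
print(tom_cities)
print(tom_countries)
```

Summing the values of `tom_cities` gives the number of rows for Tom that have a city key, which is the "city: 3" figure the question expects.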

Using collections.defaultdict makes the groupby and groupbydate functions shorter:

from collections import defaultdict

def groupby( fld ):
    vals = defaultdict(int)
    for row in people:
        if fld in row:
            vals[fld] += 1
            vals[row[fld]] += 1
    return dict(vals)

months = ('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec')
def groupbydate( fld ):
    vals = {}
    for row in people:
        if fld in row and 'Date' in row:
            if row[fld] not in vals:
                vals[row[fld]] = defaultdict(int)
            month = months[int(row['Date'].lstrip('0').split('/')[0])-1]
            vals[row[fld]][month] += 1
    return vals

print( groupby( 'name' ) )
print( groupby( 'city' ) )
print( groupby( 'country' ) )
print( groupbydate( 'city' ) )
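
Splitting the date string by hand works, but it silently accepts malformed dates. As a possible alternative, the sketch below parses each date with `datetime.strptime`, which raises `ValueError` on bad input. The function name `group_by_month`, its `fmt` parameter, and the sample rows are assumptions for illustration; the format string assumes MM/DD/YYYY as in the answer's data.

```python
from collections import defaultdict
from datetime import datetime

MONTHS = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

# Small sample in the same shape as the answer's data.
rows = [
    {"name": "Tom", "city": "NewYork", "Date": "01/01/2021"},
    {"name": "Pam", "city": "London", "Date": "03/06/2021"},
    {"name": "Tom", "city": "London", "Date": "03/06/2021"},
    {"name": "Ben", "city": "Colombo", "Date": "7/01/2020"},
]

def group_by_month(rows, fld, fmt="%m/%d/%Y"):
    """Count rows per value of `fld` and per month name (hypothetical helper)."""
    vals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        if fld in row and "Date" in row:
            # strptime validates the whole date, not just the month field.
            parsed = datetime.strptime(row["Date"], fmt)
            vals[row[fld]][MONTHS[parsed.month - 1]] += 1
    # Convert to plain dicts so the printed result has no defaultdict wrappers.
    return {key: dict(per_month) for key, per_month in vals.items()}

by_month = group_by_month(rows, "city")
print(by_month)
```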

Follow-up: adding the year

def groupbyyear( fld ):
    vals = {}
    for row in people:
        if fld in row and 'Date' in row:
            if row[fld] not in vals:
                vals[row[fld]] = defaultdict(int)
            # Dates in `people` above are MM/DD/YYYY, so the year is the third '/'-separated field
            year = int(row['Date'].split('/')[2])
            vals[row[fld]][year] += 1
    return vals

print( groupby( 'name' ) )
print( groupby( 'city' ) )
print( groupby( 'country' ) )
print( groupbydate( 'city' ) )
print( groupbyyear( 'city' ) )
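
The question's data uses timestamps such as 2021-01-04 08:37:19Z rather than MM/DD/YYYY. Assuming those are stored as strings, the sketch below groups by year and month together. The function name `group_by_year_month`, the `fmt` string, and the "YYYY-MM" key layout are assumptions for illustration, not part of the original answer.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows using the timestamp layout from the question, stored as strings.
rows = [
    {"name": "Tom", "city": "NewYork", "Date": "2021-01-04 08:37:19Z"},
    {"name": "Pam", "city": "London", "Date": "2021-01-04 09:26:38Z"},
    {"name": "Ben", "city": "Colombo", "Date": "2021-01-21 09:56:04Z"},
    {"name": "Ben", "city": "Colombo", "Date": "2021-02-09 08:47:26Z"},
    {"name": "Tom", "city": "London", "Date": "2021-03-08 09:39:28Z"},
]

def group_by_year_month(rows, fld, fmt="%Y-%m-%d %H:%M:%SZ"):
    """Count rows per value of `fld` and per 'YYYY-MM' period (hypothetical helper)."""
    vals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        if fld in row and "Date" in row:
            stamp = datetime.strptime(row["Date"], fmt)
            # A combined year-month key keeps January 2020 and January 2021 apart.
            vals[row[fld]][stamp.strftime("%Y-%m")] += 1
    return {key: dict(per_period) for key, per_period in vals.items()}

by_period = group_by_year_month(rows, "city")
print(by_period)
```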