Question

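My goal is to build a dictionary of the distinct words (FreqDist keys) seen over the past three years, mapping each word to the month in which it most recently appeared.

I have already generated a dictionary whose keys are dates and whose values are the FreqDist extracted for that month.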

{'20151': FreqDist({'physiotherapy': 11, 'claimant': 5, 'rehabilitation': 4, 'agent': 3, 'assessment': 3, 'client': 2, 'via': 1, 'jigsaw': 1, 'ticc': 1, 'accupuncture': 1, ...})}
{'20152': FreqDist({'physiotherapy': 12, 'rehabilitation': 7, 'assessment': 4, 'treatment': 4, 'claimant': 3, 'ltd': 3, 'appointment': 2, 'provider': 2, 'medical': 2, 'service': 2, ...})}

...

{'20184': FreqDist({'physiotherapy': 10, 'rehabilitation': 9, 'client': 8, 'claimant': 6, 'assessment': 5, 'ticc': 5, 'agent': 3, 'treatment': 3, 'symptom': 3, 'ltd': 3, ...})}
{'20185': FreqDist({'rehabilitation': 21, 'physiotherapy': 15, 'client': 9, 'assessment': 7, 'ticc': 6, 'agent': 6, 'detail': 5, 'ltd': 4, 'arrangement': 3, 'simply': 3, ...})}.

I can then get the distinct keys from those FreqDists with:

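import itertools

# FreqDist here stands for one month's FreqDist object
Rehab_Noun_list.append(FreqDist)
# Chain the keys of every collected FreqDist and drop duplicates, keeping first-seen order
list(dict.fromkeys(itertools.chain.from_iterable(Rehab_Noun_list)))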

Given the months, how would I report the latest occurrence of each of those distinct FreqDist keys?

Answer 1

Using pandas:

import pandas as pd
from collections import defaultdict

ser = pd.Series([{'physiotherapy':10,'rehabilitation':9},
                 {'rehabilitation':21,'physiotherapy':15},
                 {'physiotherapy':12}])

count = defaultdict(int)

for d in ser:
    for key in d:
        count[key] += 1

print(count)

Or:

ser.apply(pd.Series).count().to_dict()
Output: {'physiotherapy': 3, 'rehabilitation': 2}
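
Note that the counts above tell you in how many months each word appears, not when it last appeared. A minimal sketch of one way to get the latest month per word follows. It assumes the dictionary keys are a four-digit year followed by an unpadded month number (so '20151' is January 2015), which is a guess from the sample keys in the question. collections.Counter stands in for nltk.FreqDist here, and the names monthly, month_sort_key and latest_occurrence, as well as the '201510' entry, are made up for illustration.

```python
from collections import Counter

# Hypothetical sample data; Counter stands in for nltk.FreqDist.
monthly = {
    '20151': Counter({'physiotherapy': 11, 'claimant': 5, 'jigsaw': 1}),
    '20152': Counter({'physiotherapy': 12, 'treatment': 4, 'claimant': 3}),
    '201510': Counter({'physiotherapy': 10, 'treatment': 2}),
    '20185': Counter({'rehabilitation': 21, 'physiotherapy': 15}),
}

def month_sort_key(key):
    # Assumed key format: 4-digit year + unpadded month, e.g. '201510'.
    # Parsing to integers avoids '201510' sorting before '20152' as a string.
    return int(key[:4]), int(key[4:])

def latest_occurrence(monthly):
    latest = {}
    # Walk the months from oldest to newest; later months overwrite earlier ones,
    # so each word ends up mapped to the last month it was seen in.
    for month in sorted(monthly, key=month_sort_key):
        for word in monthly[month]:
            latest[word] = month
    return latest

latest = latest_occurrence(monthly)
print(latest)
```

Sorting on a parsed (year, month) tuple rather than on the raw string matters only because the months are not zero-padded; with keys like '201501' a plain sorted(monthly) would do.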

Python - Get the latest occurrence of FreqDist keys from a series of FreqDists
