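Subsetting a tuple-keyed dictionary into dictionaries named after a key

Time: 2017-04-11 16:10:40

Tags: python dictionary

I have a dictionary (df_dict) with tuple keys (key0, key1) that holds several DataFrames, each with date and accountNum columns. I want to split df_dict into groups and generate a dictionary name from key0 for each group.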

df_dict = {('100', '001'): date, accountNum, ('100', '002'): date, accountNum, 
           ('200', '001'): date, accountNum, ('200', '002'): date, accountNum}

The DataFrames in df_dict look like this:

('100','001')-DataFrame ('100','002')-DataFrame  ('200','001')-DataFrame 
date        accountNum   date        accountNum   date        accountNum
2010-01-01     280       2010-02-01     150       2010-03-01     330
2010-01-02     285       2010-02-02     155       2010-03-02     335
2010-01-03     290       2010-02-03     160       2010-03-03     340

('200','002')-DataFrame
date        accountNum
2010-04-01     510
2010-04-02     515
2010-04-03     520

The result I expect looks like this:

df_dict_100 = {('100', '001'): date, accountNum, ('100','002'): date, accountNum}
df_dict_200 = {('200', '001'): date, accountNum, ('200','002'): date, accountNum}

The DataFrames in each dictionary are:

df_dict_100
('100','001')-DataFrame ('100','002')-DataFrame   
date        accountNum   date        accountNum   
2010-01-01     280       2010-02-01     150       
2010-01-02     285       2010-02-02     155       
2010-01-03     290       2010-02-03     160    

df_dict_200
('200','001')-DataFrame  ('200','002')-DataFrame
date        accountNum   date         accountNum
2010-03-01     330       2010-04-01     510
2010-03-02     335       2010-04-02     515
2010-03-03     340       2010-04-03     520

Here is my approach:

my_list = ['100','200']
subset_dict = {k: v for k, v in df_dict.items() if k[0] in my_list}

But it seems I just get back exactly the same dictionary as df_dict.

1 Answer:

Answer 0: (score: 0)
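
For context, the comprehension in the question returns every entry because the first element of every key ('100' or '200') is in my_list, so the filter never excludes anything. A minimal sketch of that check follows; the placeholder strings stand in for the DataFrames and are an assumption made only for illustration.

```python
# Placeholder strings stand in for the DataFrames (assumption for illustration).
df_dict = {
    ('100', '001'): 'df_100_001',
    ('100', '002'): 'df_100_002',
    ('200', '001'): 'df_200_001',
    ('200', '002'): 'df_200_002',
}
my_list = ['100', '200']

# Every key's first element is in my_list, so nothing is filtered out.
subset_dict = {k: v for k, v in df_dict.items() if k[0] in my_list}
same_as_original = subset_dict == df_dict

# Comparing against a single value does narrow the dictionary.
only_100 = {k: v for k, v in df_dict.items() if k[0] == '100'}

print(same_as_original)
print(sorted(only_100))
```

Filtering on one value of k[0] at a time is what separates the groups.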

You can turn the first form into something like the second by building a multi-level (nested) dictionary. Instead of df_dict_100 you would then have new_dict['100']:

import pprint

# Placeholder strings stand in for the DataFrames.
date, accountNum = 'date', 'accountNum'
df_dict = {('100', '001'): (date, accountNum), ('100', '002'): (date, accountNum),
           ('200', '001'): (date, accountNum), ('200', '002'): (date, accountNum)}

new_dict = dict()
for key, value in df_dict.items():
    # Group each entry under the first element of its tuple key.
    new_dict.setdefault(key[0], {})[key] = value

pprint.pprint(new_dict)

The result is:

{'100': {('100', '001'): ('date', 'accountNum'),
         ('100', '002'): ('date', 'accountNum')},
 '200': {('200', '001'): ('date', 'accountNum'),
         ('200', '002'): ('date', 'accountNum')}}

To access an individual item, you can use this syntax:

print(new_dict['100']['100', '001'][0])

If you prefer a dictionary comprehension, try this:

subset_dict = {
    matching_key : {
        k: v for k, v in df_dict.items() if k[0] == matching_key }
    for matching_key in set(k[0] for k in df_dict)
}

In the comments, the OP asked how to generate two separate dictionaries rather than two dictionaries nested inside one. Something like this should work:

df_dict_100 = { k: v for k, v in df_dict.items() if k[0] == '100' }
df_dict_200 = { k: v for k, v in df_dict.items() if k[0] == '200' }

Put these in a for loop. Here is a complete program:

import pprint

# Placeholder strings stand in for the DataFrames.
date, accountNum = 'date', 'accountNum'
df_dict = {('100', '001'): (date, accountNum), ('100', '002'): (date, accountNum),
           ('200', '001'): (date, accountNum), ('200', '002'): (date, accountNum)}

my_list = ['100', '200']
for i in my_list:
    # Keep only the entries whose first key element equals i.
    new_df_dict = { k: v for k, v in df_dict.items() if k[0] == i }
    pprint.pprint(new_df_dict)
    print("----")

Here is the output:

{('100', '001'): ('date', 'accountNum'),
 ('100', '002'): ('date', 'accountNum')}
----
{('200', '001'): ('date', 'accountNum'),
 ('200', '002'): ('date', 'accountNum')}
----

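
As an alternative to setdefault, the same grouping can be sketched with collections.defaultdict from the standard library. The helper name group_by_first_key and the placeholder values are hypothetical, introduced here only for illustration; real DataFrames would be stored in the same way.

```python
from collections import defaultdict


def group_by_first_key(tuple_keyed):
    """Group a tuple-keyed dict into {key[0]: {key: value}} (hypothetical helper)."""
    grouped = defaultdict(dict)
    for key, value in tuple_keyed.items():
        grouped[key[0]][key] = value
    # Return a plain dict so that looking up a missing group raises KeyError.
    return dict(grouped)


# Placeholder strings stand in for the DataFrames (assumption for illustration).
df_dict = {
    ('100', '001'): 'df_100_001',
    ('100', '002'): 'df_100_002',
    ('200', '001'): 'df_200_001',
    ('200', '002'): 'df_200_002',
}

groups = group_by_first_key(df_dict)
df_dict_100 = groups['100']
df_dict_200 = groups['200']

print(sorted(groups))
print(sorted(df_dict_200))
```

Looking groups up by key, as in groups['100'], avoids having to create variable names such as df_dict_100 dynamically when the set of key0 values is not known in advance.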