Merge 2 lists of dicts based on a common value

Time: 2015-10-21 15:32:52

Tags: python list dictionary pandas

So I have 2 lists of dicts as follows:

list1 = [
{'name':'john',
'gender':'male',
'grade': 'third'
},
{'name':'cathy',
'gender':'female',
'grade':'second'
},
]

list2 = [
{'name':'john',
'physics':95,
'chemistry':89
},
{'name':'cathy',
'physics':78,
'chemistry':69
},
]

The output list I need is as follows:

final_list = [
{'name':'john',
'gender':'male',
'grade':'third',
'marks': {'physics':95, 'chemistry': 89}
},
{'name':'cathy',
'gender':'female',
'grade':'second',
'marks': {'physics':78, 'chemistry': 69}
},
]

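First I tried iterating as follows:

final_list = []
for item1 in list1:
    for item2 in list2:
        if item1['name'] == item2['name']:
            temp = dict(item2)  # copy so the dict in list2 is not modified
            temp.pop('name')
            final_list.append(dict(name=item1['name'], **temp))

However, this did not give me the desired result. I also tried pandas, with which I have limited experience: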

>>> import pandas as pd
>>> df1 = pd.DataFrame(list1)
>>> df2 = pd.DataFrame(list2)
>>> result = pd.merge(df1, df2, on=['name'])

However, I am clueless about how to get the data back into the original format I need. Any help is appreciated.

4 Answers:

Answer 0: (score: 3)

You could first merge the two dataframes (as in the question, with the merged result named df here), which would look like:

In [145]: df
Out[145]:
   gender   grade   name  chemistry  physics
0    male   third   john         89       95
1  female  second  cathy         69       78

Then create a marks column as a dict:

In [146]: df['marks'] = df.apply(lambda x: [x[['chemistry', 'physics']].to_dict()], axis=1)

In [147]: df
Out[147]:
   gender   grade   name  chemistry  physics  \
0    male   third   john         89       95
1  female  second  cathy         69       78

                                  marks
0  [{u'chemistry': 89, u'physics': 95}]
1  [{u'chemistry': 69, u'physics': 78}]

And use the to_dict(orient='records') method on the selected dataframe columns:

In [148]: df[['name', 'gender', 'grade', 'marks']].to_dict(orient='records')
Out[148]:
[{'gender': 'male',
  'grade': 'third',
  'marks': [{'chemistry': 89L, 'physics': 95L}],
  'name': 'john'},
 {'gender': 'female',
  'grade': 'second',
  'marks': [{'chemistry': 69L, 'physics': 78L}],
  'name': 'cathy'}]
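
For reference, a self-contained Python 3 variant of this pandas route might look like the sketch below. It is not the answerer's code: it stores marks as a plain dict (the shape requested in the question) rather than a one-element list, it assumes every column other than name, gender and grade is a mark, and mark_cols and marks are names chosen here for illustration.

```python
import pandas as pd

list1 = [
    {'name': 'john', 'gender': 'male', 'grade': 'third'},
    {'name': 'cathy', 'gender': 'female', 'grade': 'second'},
]
list2 = [
    {'name': 'john', 'physics': 95, 'chemistry': 89},
    {'name': 'cathy', 'physics': 78, 'chemistry': 69},
]

# Merge on the shared 'name' column (pd.merge does an inner join by default).
df = pd.merge(pd.DataFrame(list1), pd.DataFrame(list2), on='name')

# Assumption: every column that is not a personal field is a mark.
mark_cols = [c for c in df.columns if c not in ('name', 'gender', 'grade')]

# One dict of marks per row, in row order.
marks = df[mark_cols].to_dict(orient='records')

# One dict of personal fields per row, then attach the matching marks.
final_list = df.drop(columns=mark_cols).to_dict(orient='records')
for record, row_marks in zip(final_list, marks):
    record['marks'] = row_marks

print(final_list)
```

Building the nested dicts after the conversion avoids a row-wise apply, which tends to matter only for larger frames.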

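Answer 1: (score: 1)

Using your pandas approach, you can call

result.to_dict(orient='records')

to get it back as a list of dictionaries. It will not put marks in as a subfield, because nothing tells it to do so; physics and chemistry are just fields at the same level as the other fields.

You may also be running into problems because the name is 'cathy' in the first list and 'kathy' in the second, which naturally will not merge.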
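
One way to spot such mismatched names before converting back is an outer merge with pandas' indicator option. The sketch below is only an illustration: the 'kathy' spelling is introduced deliberately to reproduce the mismatch, and check and unmatched are names made up here.

```python
import pandas as pd

df1 = pd.DataFrame([
    {'name': 'john', 'gender': 'male', 'grade': 'third'},
    {'name': 'cathy', 'gender': 'female', 'grade': 'second'},
])
# 'kathy' is deliberately misspelled to reproduce the mismatch.
df2 = pd.DataFrame([
    {'name': 'john', 'physics': 95, 'chemistry': 89},
    {'name': 'kathy', 'physics': 78, 'chemistry': 69},
])

# how='outer' keeps rows that have no partner; indicator=True adds a
# '_merge' column holding 'both', 'left_only' or 'right_only'.
check = pd.merge(df1, df2, on='name', how='outer', indicator=True)

# Names that appear in only one of the two frames.
unmatched = check.loc[check['_merge'] != 'both', 'name'].tolist()
print(unmatched)
```

An empty unmatched list suggests the default inner merge will not silently drop anyone.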

Answer 2: (score: 1)

Considering you want a list of dicts as output, you can easily do what you want without pandas. Use a dict to store all the information, with the names as the outer keys, making a single pass over each list, unlike the O(n^2) double loop in your own code:

out = {d["name"]: d for d in list1}
for d in list2:
    out[d.pop("name")]["marks"] = d

from pprint import pprint as pp

pp(list(out.values()))

Output:

[{'gender': 'female',
  'grade': 'second',
  'marks': {'chemistry': 69, 'physics': 78},
  'name': 'cathy'},
 {'gender': 'male',
  'grade': 'third',
  'marks': {'chemistry': 89, 'physics': 95},
  'name': 'john'}]

That reuses the dicts in your lists. If you wanted to create new dicts instead:

out = {d["name"]: d.copy() for d in list1}

for d in list2:
    marks = d.copy()  # copy first so the dict in list2 keeps its "name" key
    k = marks.pop("name")
    out[k]["marks"] = marks

from pprint import pprint as pp

pp(list(out.values()))

The output is the same.
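
The same dict-lookup idea can be wrapped in a small helper that leaves both input lists untouched. This is a minimal sketch, not part of the original answer: merge_marks, people, scores and key are hypothetical names, and giving an empty marks dict to a person with no scores is an assumption about the desired behaviour.

```python
def merge_marks(people, scores, key="name"):
    """Return new dicts: each person plus a nested 'marks' dict.

    Neither input list is modified. A person with no entry in `scores`
    gets an empty 'marks' dict; scores for unknown names are ignored.
    """
    # Index the scores by name, leaving the key itself out of the marks.
    marks_by_name = {
        s[key]: {k: v for k, v in s.items() if k != key}
        for s in scores
    }
    # One pass over people; dict(p, marks=...) builds a new dict each time.
    return [dict(p, marks=marks_by_name.get(p[key], {})) for p in people]


list1 = [
    {'name': 'john', 'gender': 'male', 'grade': 'third'},
    {'name': 'cathy', 'gender': 'female', 'grade': 'second'},
]
list2 = [
    {'name': 'john', 'physics': 95, 'chemistry': 89},
    {'name': 'cathy', 'physics': 78, 'chemistry': 69},
]

final_list = merge_marks(list1, list2)
print(final_list)
```

Because the result is built by iterating over the first list, it keeps that list's order, and each list is still traversed only once.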

Answer 3: (score: 1)

Create a function that adds a marks column; this column should contain a dictionary of the physics and chemistry marks.

def create_marks(df):
    df['marks'] = { 'chemistry' : df['chemistry'] , 'physics' : df['physics'] }
    return df

result_with_marks = result.apply( create_marks , axis = 1)

Out[19]:
gender  grade   name    chemistry   physics            marks
male    third   john    89             95   {u'chemistry': 89, u'physics': 95}
female  second  cathy   69             78   {u'chemistry': 69, u'physics': 78}

Then convert it to your desired result as follows:

result_with_marks.drop( ['chemistry' , 'physics'], axis = 1).to_dict(orient = 'records')

Out[20]:
[{'gender': 'male',
  'grade': 'third',
  'marks': {'chemistry': 89L, 'physics': 95L},
  'name': 'john'},
 {'gender': 'female',
  'grade': 'second',
  'marks': {'chemistry': 69L, 'physics': 78L},
  'name': 'cathy'}]