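I'm trying to merge two datasets, using unique IDs for clients who exist in only one of the datasets. I have assigned unique IDs to each full name in a dictionary, but every person is unique, even when two people share the same name. I need to assign each unique ID, one at a time, to each instance of that person's name.
Example dictionary: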
{'Corey Davis': {'names_id':[1472]}, 'Jose Hernandez': {'names_id': [3464,15202,82567,98472]}, ...}
I have tried the .map() function, as well as
referrals['names_id'] = referrals['full_name'].copy()
for key, val in m.items():
    referrals.loc[referrals.names_id == key, 'names_id'] = val
But, of course, that only assigns the last value encountered, 98472.
I'd like to end up with something like this:
full_name names_id \
Corey Davis 1472
Jose Hernandez 3464
Jose Hernandez 15202
Jose Hernandez 82567
Jose Hernandez 98472
But what I get is
full_name names_id \
Corey Davis 1472
Jose Hernandez 98472
Jose Hernandez 98472
Jose Hernandez 98472
Jose Hernandez 98472
Answer 0 (score: 0)
What I would personally do is:
import pandas as pd

inputs = [{'full_name': 'test', 'names_id': [1]}, {'full_name': 'test2', 'names_id': [2, 3, 4]}]

# Create one dictionary per (name, ID) pair, i.e. one per row of the table
entries = []
for record in inputs:  # not named `input`, which would shadow the builtin
    for name_id in record['names_id']:
        entries.append({'full_name': record['full_name'], 'names_id': name_id})

# Now you have a list of dicts, each being one line of your table
# entries is now
# [{'full_name': 'test', 'names_id': 1},
#  {'full_name': 'test2', 'names_id': 2},
#  {'full_name': 'test2', 'names_id': 3},
#  {'full_name': 'test2', 'names_id': 4}]

# I like pandas and use it for its dataframes; you can create one from a list of dicts
final_dataframe = pd.DataFrame(entries)
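
If you would rather keep your existing dataframe and the dictionary in the shape from the question, one possible approach is to number each repeated name by its order of appearance and use that number to pick an ID from the list. The sketch below is only an illustration. The `referrals` frame here is a made-up stand-in for yours, `pick_id` is a hypothetical helper, and I am assuming the n-th row for a name should get the n-th ID in that name's list.

```python
import pandas as pd

# Mapping in the shape shown in the question: name -> {'names_id': [ids]}
m = {
    'Corey Davis': {'names_id': [1472]},
    'Jose Hernandez': {'names_id': [3464, 15202, 82567, 98472]},
}

# Hypothetical stand-in for the question's `referrals` dataframe
referrals = pd.DataFrame({
    'full_name': ['Corey Davis'] + ['Jose Hernandez'] * 4,
})

# Position of each row among the rows sharing its full_name: 0, 1, 2, ...
occurrence = referrals.groupby('full_name').cumcount()


def pick_id(name, position):
    """Return the position-th ID listed for `name`, or None if there is none."""
    ids = m.get(name, {}).get('names_id', [])
    return ids[position] if position < len(ids) else None


# Assumption: the n-th occurrence of a name receives the n-th ID in its list
referrals['names_id'] = [
    pick_id(name, pos)
    for name, pos in zip(referrals['full_name'], occurrence)
]

print(referrals)
```

If a name shows up more often than it has IDs, or is missing from the dictionary, those rows get None here, so you may want to check for missing values afterwards.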