I have two DataFrames that I am struggling to merge:
df1 = pd.DataFrame({'id': [["001", "001"], ["001"], ["007", "001"]]})
Output:
           id
0  [001, 001]
1       [001]
2  [007, 001]
and
df2 = pd.DataFrame({'id': ["001", "007"], 'name': ['Name01', 'Name02']})
Output:
    id    name
0  001  Name01
1  007  Name02
What I want to end up with is this:
df3 = pd.DataFrame({'id': [["001", "001"], ["001"], ["007", "001"]],
                    'name': [['Name01', 'Name01'], ['Name01'], ['Name02', 'Name01']]})
Output:
           id              name
0  [001, 001]  [Name01, Name01]
1       [001]          [Name01]
2  [007, 001]  [Name02, Name01]
My problem is that I can merge the two, but I cannot get the result into the required format. This is what I have so far:
pd.DataFrame(df2.merge(df1.explode('id'), on='id')).groupby('id').agg(lambda x: x.tolist())
Answer 0 (score: 3)
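Map each id through a dictionary built from df2 inside a list comprehension. This should be faster than explode followed by aggregating into a list, but it is best to test on your real data.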
# Build an id -> name lookup from df2
d = df2.set_index('id')['name'].to_dict()
# Map every id in each list; ids missing from df2 are skipped
df1['name'] = [[d[y] for y in x if y in d] for x in df1['id']]
print(df1)
           id              name
0  [001, 001]  [Name01, Name01]
1       [001]          [Name01]
2  [007, 001]  [Name02, Name01]
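One thing to be aware of with the `if y in d` filter: an id that has no match in df2 is dropped, so the `name` list can end up shorter than the `id` list. Below is a minimal sketch of that behaviour next to a `dict.get` variant that keeps the positions aligned. The id "999" is made up for illustration and does not appear in the question's data.

```python
import pandas as pd

# "999" is a hypothetical id that is absent from df2
df1 = pd.DataFrame({'id': [["001", "001"], ["001"], ["007", "999"]]})
df2 = pd.DataFrame({'id': ["001", "007"], 'name': ['Name01', 'Name02']})

d = df2.set_index('id')['name'].to_dict()

# Filtering variant: unknown ids are silently dropped
dropped = [[d[y] for y in x if y in d] for x in df1['id']]

# dict.get variant: unknown ids become None, list lengths stay equal
aligned = [[d.get(y) for y in x] for x in df1['id']]

print(dropped)
print(aligned)
```

Which variant fits depends on whether downstream code expects the two list columns to have matching lengths.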
Answer 1 (score: 3)
We can use explode + merge:
df1 = df1.explode('id').reset_index().merge(df2, how='left').groupby('index').agg(list)
               id              name
index
0      [001, 001]  [Name01, Name01]
1           [001]          [Name01]
2      [007, 001]  [Name02, Name01]
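To make it clearer why the `reset_index()` step matters, here is a small sketch comparing grouping by `id` (as in the question's attempt) with grouping by the preserved row index. The variable names `by_id` and `by_row` are only for illustration, and `on='id'` is spelled out explicitly here although the one-liner above relies on the shared column.

```python
import pandas as pd

df1 = pd.DataFrame({'id': [["001", "001"], ["001"], ["007", "001"]]})
df2 = pd.DataFrame({'id': ["001", "007"], 'name': ['Name01', 'Name02']})

# Grouping by 'id' yields one row per distinct id value,
# so the original three rows cannot be recovered
by_id = df2.merge(df1.explode('id'), on='id').groupby('id').agg(list)

# reset_index() keeps the original row label in an 'index' column;
# grouping on it rebuilds one list per original row
by_row = (df1.explode('id')
             .reset_index()
             .merge(df2, how='left', on='id')
             .groupby('index')
             .agg(list))

print(len(by_id), len(by_row))
```

The left merge is what keeps the exploded rows in their original order, so the lists come back in the same order as in df1.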
Or just use map and assign the result:
df1['name'] = df1.id.explode().map(df2.set_index('id').name).groupby(level=0).agg(list)
0    [Name01, Name01]
1            [Name01]
2    [Name02, Name01]
Name: id, dtype: object
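Note that `groupby(level=0)` regroups by df1's index labels, so this version assumes those labels are unique. A rough sketch of what happens otherwise, using a hypothetical df1 with a repeated label (for example after a concat), together with one possible workaround of resetting the index first:

```python
import pandas as pd

# Hypothetical df1 whose index repeats the label 0
df1 = pd.DataFrame({'id': [["001", "001"], ["001"], ["007", "001"]]},
                   index=[0, 0, 1])
df2 = pd.DataFrame({'id': ["001", "007"], 'name': ['Name01', 'Name02']})

# id -> name lookup as a Series indexed by id
lookup = df2.set_index('id')['name']

# The two rows labelled 0 are merged into a single group
collapsed = df1['id'].explode().map(lookup).groupby(level=0).agg(list)

# Resetting the index gives every row its own label before exploding
clean = df1.reset_index(drop=True)
clean['name'] = clean['id'].explode().map(lookup).groupby(level=0).agg(list)

print(collapsed)
print(clean)
```

With the default RangeIndex from the question this is not an issue, so the extra `reset_index(drop=True)` is only needed when the index may contain duplicates.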