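I have 2 DataFrames that I want to merge on the column names. The names column in one df holds abbreviated versions, while the names column in the other df holds the full names. What is the most efficient way to change the names so that they match each other?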
df1['names'] = ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"]
df2['names'] = ["Manchester United", "Manchester City", "Chelsea FC", "Liverpool FC", "Tottenham Hotspurs", "Arsenal FC"]
Answer 0 (score: 2)
You can use dict(zip()):
df1['names'] = ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"]
df2['names'] = ["Manchester United", "Manchester City", "Chelsea FC", "Liverpool FC", "Tottenham Hotspurs", "Arsenal FC"]
d = dict(zip(df1['names'], df2['names']))  # mapping from abbreviated to full names; relies on both columns listing the clubs in the same order
print(d)
{'Man Utd': 'Manchester United',
'Man City': 'Manchester City',
'Chelsea': 'Chelsea FC',
'Liverpool': 'Liverpool FC',
'Spurs': 'Tottenham Hotspurs',
'Arsenal': 'Arsenal FC'}
Then change df1['names'] with:
df1['names'] = df1['names'].map(d)
After this you can perform the merge, because the values in the two names columns now match.
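The merge step itself can be sketched as follows. This is a minimal example, assuming pandas is imported as pd; the `points` and `stadium` columns and their values are hypothetical and only there so the merged result has something to show.

```python
import pandas as pd

# Abbreviated names in df1, full names in df2 (values from the question).
df1 = pd.DataFrame({
    "names": ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"],
    "points": [1, 2, 3, 4, 5, 6],  # hypothetical extra column
})
df2 = pd.DataFrame({
    "names": ["Manchester United", "Manchester City", "Chelsea FC",
              "Liverpool FC", "Tottenham Hotspurs", "Arsenal FC"],
    "stadium": ["A", "B", "C", "D", "E", "F"],  # hypothetical extra column
})

# Abbreviation -> full-name mapping; this only works because both
# columns list the clubs in the same order.
d = dict(zip(df1["names"], df2["names"]))
df1["names"] = df1["names"].map(d)

# Both frames now share the same key values, so an inner merge keeps every row.
merged = df1.merge(df2, on="names", how="inner")
print(merged)
```

If the two columns are not in the same order, the zip-based mapping pairs the wrong names, so in that case an explicit dictionary is the safer choice.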
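Answer 1 (score: 1)
The only way you can achieve this is to maintain a referential (a lookup dictionary) that matches the two name columns:
import pandas as pd

df1 = pd.DataFrame()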
referential = {
"Man Utd": "Manchester United",
"Man City": "Manchester City",
"Chelsea": "Chelsea FC",
"Liverpool": "Liverpool FC",
"Spurs": "Tottenham Hotspurs",
"Arsenal": "Arsenal FC"
}
df1['names'] = ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"]
df1['names'] = df1['names'].map(referential)
print(df1)
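One thing to watch with a hand-maintained referential is that map returns NaN for any name that has no entry. A small sketch of how that could be detected and handled, assuming a shortened referential and a hypothetical extra club ("Everton") that is not in the original data:

```python
import pandas as pd

# Shortened referential for illustration.
referential = {
    "Man Utd": "Manchester United",
    "Spurs": "Tottenham Hotspurs",
}

# "Everton" is a hypothetical name with no entry in the referential.
df1 = pd.DataFrame({"names": ["Man Utd", "Spurs", "Everton"]})

mapped = df1["names"].map(referential)

# Names missing from the referential come back as NaN; list them for review.
unmapped = df1.loc[mapped.isna(), "names"].tolist()
print(unmapped)

# Fall back to the original value wherever no mapping exists.
df1["names"] = mapped.fillna(df1["names"])
print(df1)
```

Checking the unmapped list before merging makes it easier to spot rows that would otherwise silently drop out of an inner merge.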
Answer 2 (score: 1)
Constructing a dictionary and then feeding it to pd.Series.map is one way. However, sticking with pandas, you can also use pd.Series.replace directly:
import pandas as pd

lst1 = ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"]
lst2 = ["Manchester United", "Manchester City", "Chelsea FC", "Liverpool FC",
        "Tottenham Hotspurs", "Arsenal FC"]
# define input dataframe
df = pd.DataFrame({'names': lst1})
# replace each value in lst1 with the value at the same position in lst2
df['names'] = df['names'].replace(lst1, lst2)
print(df)
                names
0   Manchester United
1     Manchester City
2          Chelsea FC
3        Liverpool FC
4  Tottenham Hotspurs
5          Arsenal FC
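pd.Series.replace also accepts a dictionary, and it leaves values without a match unchanged instead of turning them into NaN. A minimal sketch, assuming a hypothetical extra name ("Everton") that appears in neither list:

```python
import pandas as pd

lst1 = ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"]
lst2 = ["Manchester United", "Manchester City", "Chelsea FC", "Liverpool FC",
        "Tottenham Hotspurs", "Arsenal FC"]

# Pair the lists by position to get an abbreviation -> full-name dictionary.
mapping = dict(zip(lst1, lst2))

# "Everton" is a hypothetical value that has no entry in the mapping.
s = pd.Series(["Spurs", "Arsenal", "Everton"])

# Matching values are swapped; values without a match are kept as they are.
replaced = s.replace(mapping)
print(replaced.tolist())
```

Whether that behaviour is preferable depends on the data: keeping unmatched names avoids NaN keys, but it also hides names that still need an entry in the mapping.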