I have two DataFrames. The first one, "station_anal", is shown below:
count Start station number
index
31623 17105 31623
31258 11432 31258
31201 10194 31201
31200 9505 31200
31247 9145 31247
The second DataFrame, "vt", is:
Start station number Start station
0 31214 17th & Corcoran St NW
1 31104 Adams Mill & Columbia Rd NW
2 31221 18th & M St NW
3 31111 10th & U St NW
4 31260 23rd & E St NW
station_anal has shape 486x2
vt has shape 8000x2
My left join command is:
lj = pd.merge(station_anal, vt, how = 'left', on = 'Start station number')
The dtype of the key column is the same in both frames, namely int64
But lj returns:
lj.head()
count Start station number Start station
0 17105 31623 Columbus Circle / Union Station
1 17105 31623 Columbus Circle / Union Station
2 17105 31623 Columbus Circle / Union Station
3 17105 31623 Columbus Circle / Union Station
4 17105 31623 Columbus Circle / Union Station
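Its shape is 8000x3
This makes no sense to me, because my understanding is that with a left join the number of rows in the result is always that of the first DataFrame, here 486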
Answer 0 (score: 0)
Let's use map:
station_anal['Start station'] = station_anal['Start station number']\
    .map(vt.set_index('Start station number')['Start station'])
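Update: vt has repeated station numbers (that is why the left join grew beyond 486 rows), so drop the duplicates first, then map: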
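# Keep one row per station number so the lookup index is unique
mapper = vt.drop_duplicates('Start station number')\
    .set_index('Start station number')['Start station']
# Look up each station name by number; the row count of station_anal is unchanged
station_anal['Start station'] = station_anal['Start station number']\
    .map(mapper)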
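To see why the row count changed, here is a minimal sketch on toy data. The frames below are small stand-ins for the ones in the question: the second station name and the number of repeats are made up for illustration. A left join keeps every row of the left frame, but each left row is repeated once per matching row on the right, so duplicate keys in vt inflate the result. If you prefer merge over map, dropping the duplicates first and passing validate='many_to_one' should keep the result at the left frame's length and raise an error if the right keys are not unique.

```python
import pandas as pd

# Toy stand-ins for the question's frames (values partly made up).
station_anal = pd.DataFrame({
    "count": [17105, 11432, 10194],
    "Start station number": [31623, 31258, 31201],
})
vt = pd.DataFrame({
    "Start station number": [31623, 31623, 31623, 31258],
    "Start station": ["Columbus Circle / Union Station"] * 3
                     + ["Hypothetical Station B"],
})

# Left join against a right frame with repeated keys:
# each left row appears once per matching right row.
grown = pd.merge(station_anal, vt, how="left", on="Start station number")

# Number of rows in vt whose key already appeared in an earlier row.
n_dupes = int(vt["Start station number"].duplicated().sum())

# Keep one row per station number, then merge.
# validate="many_to_one" makes pandas check that the right keys are unique.
lookup = vt.drop_duplicates("Start station number")
lj = pd.merge(
    station_anal,
    lookup,
    how="left",
    on="Start station number",
    validate="many_to_one",
)

print(len(station_anal), len(grown), len(lj))
```

Stations with no entry in vt (31201 in this toy data) get NaN in 'Start station' rather than being dropped, which is the usual left-join behaviour.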