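In the example below I can get the merge to work, but how do I keep the second key column (cities) from showing up in the result? Do I have to add a separate line of code: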
df_merge = df_merge.drop(columns='cities')
Is there no way to choose which columns get merged into the left dataset? What if df2 had 30 columns and I only wanted 10 of them?
import pandas as pd
df1 = pd.DataFrame({
"city": ['new york','chicago', 'orlando','ottawa'],
"humidity": [35,69,79,99]
})
df2 = pd.DataFrame({
"cities": ['new york', 'chicago', 'toronto'],
"temp": [1, 6, -35]
})
df_merge = df1.merge(df2, left_on='city', right_on='cities', how='left')
print(df_merge)
**output**
index city humidity cities temp
0 0 new york 35 new york 1.0
1 1 chicago 69 chicago 6.0
2 2 orlando 79 NaN NaN
3 3 ottawa 99 NaN NaN
Answer 0 (score: 4)
merge
Rename the column on the right-hand frame first, so both frames share the key name city:
df1.merge(df2.rename(columns={'cities': 'city'}), 'left')
city humidity temp
0 new york 35 1.0
1 chicago 69 6.0
2 orlando 79 NaN
3 ottawa 99 NaN
If you need to be explicit about what you are merging on:
df1.merge(df2.rename(columns={'cities': 'city'}), how='left', on='city')
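For the part of the question about a wide df2, one option is to select only the wanted columns (plus the key) before merging. This is a minimal sketch, not something taken from the answer itself; the extra column wind and the list wanted are hypothetical names added purely for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ['new york', 'chicago', 'orlando', 'ottawa'],
    "humidity": [35, 69, 79, 99],
})
# 'wind' is a hypothetical extra column, standing in for the many
# columns of a wide df2 that should NOT be carried over.
df2 = pd.DataFrame({
    "cities": ['new york', 'chicago', 'toronto'],
    "temp": [1, 6, -35],
    "wind": [10, 20, 30],
})

wanted = ['temp']  # columns to bring over from df2 (assumed for this sketch)

# Keep the key plus the wanted columns, then rename the key to match df1.
right = df2[['cities'] + wanted].rename(columns={'cities': 'city'})
df_merge = df1.merge(right, how='left', on='city')
print(df_merge)
```

Because the key has the same name on both sides, the result has a single city column and no separate drop step is needed.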
join
Set the index of the right-hand frame first. how='left' is the default for join.
df1.join(df2.set_index('cities'), 'city')
city humidity temp
0 new york 35 1.0
1 chicago 69 6.0
2 orlando 79 NaN
3 ottawa 99 NaN
map
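Build a dictionary that maps each city to its temp, then map it onto df1.city. This shortcut relies on df2 having exactly two columns.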
df1.assign(temp=df1.city.map(dict(df2.values)))
city humidity temp
0 new york 35 1.0
1 chicago 69 6.0
2 orlando 79 NaN
3 ottawa 99 NaN
Less cute, more explicit:
df1.assign(temp=df1.city.map(dict(df2.set_index('cities').temp)))
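As a variation on the explicit form, Series.map also accepts a Series as the lookup, so the dict conversion can probably be skipped. A minimal sketch, assuming the city names in df2 are unique (a duplicated key would make the lookup ambiguous); the names lookup and result are introduced here only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ['new york', 'chicago', 'orlando', 'ottawa'],
    "humidity": [35, 69, 79, 99],
})
df2 = pd.DataFrame({
    "cities": ['new york', 'chicago', 'toronto'],
    "temp": [1, 6, -35],
})

# Series indexed by city name; assumes those names are unique in df2.
lookup = df2.set_index('cities')['temp']

# Cities missing from the lookup (orlando, ottawa) come back as NaN.
result = df1.assign(temp=df1['city'].map(lookup))
print(result)
```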
Answer 1 (score: 2)
set_index
Index both frames by city name, then assign the column directly. The assignment aligns on the index.
df1=df1.set_index('city');df2=df2.set_index('cities')
df1['temp']=df2.temp
df1.reset_index()
Out[595]:
city humidity temp
0 new york 35 1.0
1 chicago 69 6.0
2 orlando 79 NaN
3 ottawa 99 NaN
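The snippet above rebinds df1 and df2 to their re-indexed versions. If the original frames need to stay untouched, the same index-aligned assignment can be done on a copy. A minimal sketch under that assumption; left and result are hypothetical names, and the key values in df2 are assumed to be unique.

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ['new york', 'chicago', 'orlando', 'ottawa'],
    "humidity": [35, 69, 79, 99],
})
df2 = pd.DataFrame({
    "cities": ['new york', 'chicago', 'toronto'],
    "temp": [1, 6, -35],
})

# set_index returns a new DataFrame, so df1 itself is not modified.
left = df1.set_index('city')

# Column assignment aligns on index labels: cities absent from df2 get NaN,
# and toronto (absent from df1) is ignored.
left['temp'] = df2.set_index('cities')['temp']

result = left.reset_index()
print(result)
```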