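I have been given two DataFrames containing school food ratings for different food types across campuses. The first df holds student ratings and the second holds teacher ratings. Neither the row order nor the length of either df is guaranteed. That said, I need to combine the two.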
import numpy as np
import pandas as pd

student_ratings = pd.DataFrame({'food': ['chinese', 'mexican', 'american', 'chinese', 'mexican', 'american'],
                                'campus': [37, 37, 37, 25, 25, 25],
                                'student_rating': [97, 90, 83, 96, 89, 82]})
teacher_ratings = pd.DataFrame({'food': ['chinese', 'mexican', 'american', 'chinese', 'mexican', 'american', 'chinese', 'mexican', 'american'],
                                'campus': [25, 25, 25, 37, 37, 37, 45, 45, 45],
                                'teacher_rating': [87, 80, 73, 86, 79, 72, 67, 62, 65]})
#...
# SOMETHING LIKE WHAT I'M AFTER...
combined_ratings = pd.DataFrame({'food': ['chinese', 'mexican', 'american', 'chinese', 'mexican', 'american', 'chinese', 'mexican', 'american'],
                                 'campus': [25, 25, 25, 37, 37, 37, 45, 45, 45],
                                 'student_rating': [96, 89, 82, 97, 90, 83, np.nan, np.nan, np.nan],
                                 'teacher_rating': [87, 80, 73, 86, 79, 72, 67, 62, 65]})
I basically want to add columns (possibly several of them), but I need the rows matched up by food and campus.
Answer 0 (score: 2)
It looks like you need an outer merge:
res = pd.merge(student_ratings, teacher_ratings, how='outer')
print(res)
   campus      food  student_rating  teacher_rating
0      37   chinese            97.0              86
1      37   mexican            90.0              79
2      37  american            83.0              72
3      25   chinese            96.0              87
4      25   mexican            89.0              80
5      25  american            82.0              73
6      45   chinese             NaN              67
7      45   mexican             NaN              62
8      45  american             NaN              65
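
When `on` is omitted, `pd.merge` joins on every column name the two frames share, which here happens to be food and campus. If more shared columns are added later, that default would silently change the match, so it may be safer to name the keys. The following is a minimal sketch of that variant using the frames from the question; the final sort is my own addition, on the assumption that a predictable row order is wanted, since the order of rows coming out of the merge should not be relied on.

```python
import pandas as pd

student_ratings = pd.DataFrame({'food': ['chinese', 'mexican', 'american', 'chinese', 'mexican', 'american'],
                                'campus': [37, 37, 37, 25, 25, 25],
                                'student_rating': [97, 90, 83, 96, 89, 82]})
teacher_ratings = pd.DataFrame({'food': ['chinese', 'mexican', 'american', 'chinese', 'mexican', 'american', 'chinese', 'mexican', 'american'],
                                'campus': [25, 25, 25, 37, 37, 37, 45, 45, 45],
                                'teacher_rating': [87, 80, 73, 86, 79, 72, 67, 62, 65]})

# Name the join keys explicitly instead of relying on the shared-column default.
res = pd.merge(student_ratings, teacher_ratings, on=['food', 'campus'], how='outer')

# Sort by the keys so the result does not depend on the input row order.
res = res.sort_values(['campus', 'food']).reset_index(drop=True)
print(res)
```

Campus 45 has no student ratings, so those three rows get NaN in student_rating, and the column becomes float as a result.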