I have two dataframes:
Dataframe 1:
userId movieId rating timestamp
0 1 2 3.5 1112486027
1 1 29 3.5 1112484676
2 1 32 3.5 1112484819
3 1 47 3.5 1112484727
4 1 50 3.5 1112484580
Dataframe 2:
movieId title genres
0 1 Toy Story (1995) Adventure|Animation|Children|Comedy
1 2 Jumanji (1995) Adventure|Children|Fantasy
2 3 Grumpier Old Men (1995) Comedy|Romance
3 4 Waiting to Exhale (1995) Comedy|Drama|Romance
4 5 Father of the Bride Part II (1995) Comedy
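The two dataframes do not have the same number of rows. I want to replace the movieId numbers in dataframe 1 with the titles of the movies they represent in dataframe 2.
I tried the following code: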
s = data2['title']
while i <= 131261:
array[i]= data2.index([data2['movieId'] == i])
i = i + 1
while pos<= len(array) - 1:
data3 = data['movieId'].replace([data['movieId'] == i],'[data2[pos]]')
But it raises the following error:
TypeError Traceback (most recent call last)
<ipython-input-50-c6bed86d99a5> in <module>()
1 s = data2['title']
2 while i <= 131261:
----> 3 array[i]= data2.index([data2['movieId'] == i])
4 i = i + 1
5
TypeError: 'RangeIndex' object is not callable
What is my mistake, and can anyone suggest a better approach?
Answer 0: (score: 1)
Use map with a Series:
df1['movieId'] = df1['movieId'].map(df2.set_index('movieId')['title'])
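To see what that one-liner does, here is a minimal sketch on the sample rows from the question. It assumes df1 and df2 stand for dataframe 1 and dataframe 2; the names titles and mapped are only illustrative, and the frames are rebuilt by hand rather than loaded from the real files.

```python
import pandas as pd

# Sample frames rebuilt from the rows shown in the question (assumed, not loaded from disk).
df1 = pd.DataFrame({
    "userId": [1, 1, 1, 1, 1],
    "movieId": [2, 29, 32, 47, 50],
    "rating": [3.5, 3.5, 3.5, 3.5, 3.5],
    "timestamp": [1112486027, 1112484676, 1112484819, 1112484727, 1112484580],
})
df2 = pd.DataFrame({
    "movieId": [1, 2, 3, 4, 5],
    "title": [
        "Toy Story (1995)",
        "Jumanji (1995)",
        "Grumpier Old Men (1995)",
        "Waiting to Exhale (1995)",
        "Father of the Bride Part II (1995)",
    ],
    "genres": [
        "Adventure|Animation|Children|Comedy",
        "Adventure|Children|Fantasy",
        "Comedy|Romance",
        "Comedy|Drama|Romance",
        "Comedy",
    ],
})

# A Series indexed by movieId whose values are titles: effectively a lookup table.
titles = df2.set_index("movieId")["title"]

# map looks up each movieId in that index; ids with no match become NaN.
mapped = df1["movieId"].map(titles)
print(mapped)
```

With only these five rows of df2, just movieId 2 finds a title and the other ids come back as NaN; with the full movie table each id present in it would be replaced. As for the TypeError in the question: data2.index is an attribute holding the row index (a RangeIndex here), not a method, so calling it with parentheses fails.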