Joining two pandas DataFrames (movie_genres.dat + user_ratedmovies.dat)

Time: 2020-03-29 15:27:59

Tags: python pandas dataframe join

I have a task that requires joining two pandas DataFrames:
1. https://github.com/timyitong/mf.recommendation/blob/master/data/hetrec2011-movielens-2k-v2/user_ratedmovies.dat
2. https://github.com/timyitong/mf.recommendation/blob/master/data/hetrec2011-movielens-2k-v2/movie_genres.dat
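For example, the movie with ID 3 has two genres.
The task is to create a DataFrame with the following columns: userID, movieID, rating, genre-Action, genre-Animation, ..., genre-Western. The movie with ID 3 should have the value 1 in the genre-Comedy and genre-Romance columns. So far I have only managed to drop the unnecessary columns and assign default values to the new genre columns.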

import pandas as pd
mv_data = pd.read_table("movie_genres.dat")
ur_data = pd.read_table("user_ratedmovies.dat", usecols=['userID', 'movieID', 'rating'])
# Add one column per genre, defaulting to 0
genres = ['Action', 'Adventure', 'Animation', 'Children', 'Comedy',
          'Crime', 'Documentary', 'Drama', 'Fantasy', 'Film-Noir',
          'Horror', 'IMAX', 'Musical', 'Mystery', 'Romance',
          'Sci-Fi', 'Short', 'Thriller', 'War', 'Western']
for genre in genres:
    ur_data['genre-' + genre] = 0

print(ur_data)
print(mv_data)
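
One possible way to get from here to the target layout is to turn the genre table into one row per movie with a 0/1 column per genre, and then left-join that onto the ratings. The sketch below is only an illustration under assumptions: it uses a few made-up sample rows instead of the real files, and it assumes movie_genres.dat has the columns movieID and genre. The name genre_flags is introduced here and is not part of the original code.

```python
import pandas as pd

# Made-up sample rows standing in for the two files.
# Assumed columns: movieID/genre and userID/movieID/rating.
mv_data = pd.DataFrame({
    "movieID": [1, 1, 3, 3],
    "genre": ["Adventure", "Animation", "Comedy", "Romance"],
})
ur_data = pd.DataFrame({
    "userID": [75, 75, 78],
    "movieID": [3, 1, 3],
    "rating": [1.0, 4.5, 3.0],
})

# One row per movie, one 0/1 column per genre (hypothetical helper frame).
genre_flags = (
    pd.crosstab(mv_data["movieID"], mv_data["genre"])
    .clip(upper=1)                 # guard against duplicated (movie, genre) rows
    .add_prefix("genre-")
    .rename_axis(columns=None)
    .reset_index()
)

# A left join keeps every rating, even for movies with no genre rows.
result = ur_data.merge(genre_flags, on="movieID", how="left")

# Movies without any genre rows come back as NaN; treat them as 0.
genre_cols = [c for c in result.columns if c.startswith("genre-")]
result[genre_cols] = result[genre_cols].fillna(0).astype(int)

print(result)
```

With this approach only genres that actually occur in the genre table get a column, so pre-creating the twenty columns by hand would not be needed. If a fixed column set is required regardless of the data, the genre columns could be reindexed against an explicit list before the merge.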


0 answers:

No answers