I have a task to join two pandas DataFrames.
1. https://github.com/timyitong/mf.recommendation/blob/master/data/hetrec2011-movielens-2k-v2/user_ratedmovies.dat
2. https://github.com/timyitong/mf.recommendation/blob/master/data/hetrec2011-movielens-2k-v2/movie_genres.dat
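For example, the movie with ID 3 has 2 genres.
The task is to create a DataFrame with the following columns: userID, movieID, rating, genre-Action, genre-Animation, ..., genre-Western. The movie with ID 3 should have the value 1 in the genre-Comedy and genre-Romance columns. So far I have only managed to drop the unnecessary columns and assign default values to the new genre columns.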
import pandas as pd
mv_data = pd.read_table("movie_genres.dat")
ur_data = pd.read_table("user_ratedmovies.dat", usecols=['userID', 'movieID', 'rating'])
ur_data['genre-Action'] = 0
ur_data['genre-Adventure'] = 0
ur_data['genre-Animation'] = 0
ur_data['genre-Children'] = 0
ur_data['genre-Comedy'] = 0
ur_data['genre-Crime'] = 0
ur_data['genre-Documentary'] = 0
ur_data['genre-Drama'] = 0
ur_data['genre-Fantasy'] = 0
ur_data['genre-Film-Noir'] = 0
ur_data['genre-Horror'] = 0
ur_data['genre-IMAX'] = 0
ur_data['genre-Musical'] = 0
ur_data['genre-Mystery'] = 0
ur_data['genre-Romance'] = 0
ur_data['genre-Sci-Fi'] = 0
ur_data['genre-Short'] = 0
ur_data['genre-Thriller'] = 0
ur_data['genre-War'] = 0
ur_data['genre-Western'] = 0
print(ur_data)
print(mv_data)
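One possible way to fill in the genre columns, offered as a rough sketch rather than a definitive solution: build a movieID-by-genre table of 0/1 flags with `pd.crosstab`, then left-join it onto the ratings. The small in-memory frames below are made-up stand-ins for the two files, and the names `genre_flags`, `genre_cols`, and `result` are hypothetical. The sketch assumes `movie_genres.dat` has the columns `movieID` and `genre`.

```python
import pandas as pd

# Made-up stand-ins for movie_genres.dat and user_ratedmovies.dat (illustrative values only).
mv_data = pd.DataFrame({
    "movieID": [1, 1, 3, 3],
    "genre": ["Adventure", "Animation", "Comedy", "Romance"],
})
ur_data = pd.DataFrame({
    "userID": [75, 75, 78, 78],
    "movieID": [3, 1, 3, 5],   # movie 5 has no genre rows on purpose
    "rating": [1.0, 4.5, 3.0, 2.0],
})

# One row per movie, one column per genre; clip so duplicate rows still yield 1.
genre_flags = (
    pd.crosstab(mv_data["movieID"], mv_data["genre"])
    .clip(upper=1)
    .add_prefix("genre-")
)

# Left-join on movieID so every rating row is kept.
result = ur_data.merge(genre_flags, left_on="movieID", right_index=True, how="left")

# Movies without any genre rows come back as NaN; turn those into 0.
genre_cols = list(genre_flags.columns)
result[genre_cols] = result[genre_cols].fillna(0).astype(int)

print(result)
```

With the real files, `mv_data` and `ur_data` would come from the `pd.read_table` calls above instead. Note that only genres that actually appear in `movie_genres.dat` get a column this way, so the manual `ur_data['genre-...'] = 0` assignments would no longer be needed.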