Create a new variable whose value is proportional to the squared difference between imdbRating and imdbVotes in Python

Time: 2019-12-17 05:16:21

Tags: python pandas

This is my code for computing the squared difference between imdbRating and imdbVotes:

imdb_data['imdbVotes'] = imdb_data['imdbVotes'].astype(int)  
imdb_data['imdbRating'] = imdb_data['imdbRating'].astype(int)

imdb_data['new'] = imdb_data['imdbRating'] - imdb_data['imdbVotes']

This is the error I get with Python 3.7.0 + pandas 0.23.4:

  

TypeError: string indices must be integers

(imdb_data is a DataFrame, and the referenced column names do exist.)
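For context, this TypeError is what Python raises when a plain str is indexed with a string key. One thing worth checking (an assumption, not something the question confirms) is whether imdb_data is really a DataFrame at the line that fails. A minimal sketch, where not_a_frame and index_error_message are hypothetical names used only for illustration:

```python
# Hypothetical stand-in: a plain string where a DataFrame was expected.
not_a_frame = "IMDB_data.csv"


def index_error_message(obj, key):
    """Return the TypeError message raised by obj[key], or None if it succeeds."""
    try:
        obj[key]
    except TypeError as exc:
        return str(exc)
    return None


# Indexing a str with a str key fails with the message from the question.
message = index_error_message(not_a_frame, "imdbVotes")
print(type(not_a_frame).__name__, "->", message)

# A mapping-like object (a dict here) accepts the same lookup without error.
ok = index_error_message({"imdbVotes": 2679}, "imdbVotes")
print("dict ->", ok)
```

Printing type(imdb_data) just before the failing line is a quick way to rule this cause in or out.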

1 Answer:

Answer 0: (score: 1)

imdbRating and imdbVotes should have a float data type. So convert them to float with pd.to_numeric, then do the calculation.

import pandas as pd

# Load the data, then coerce both columns to floats (unparseable values become NaN)
imdb_data = pd.read_csv('IMDB_data.csv', sep=',', encoding='ISO-8859-1')
imdb_data['imdbRating'] = pd.to_numeric(imdb_data['imdbRating'], errors='coerce', downcast='float')
imdb_data['imdbVotes'] = pd.to_numeric(imdb_data['imdbVotes'], errors='coerce', downcast='float')
# Plain difference of the two columns
imdb_data['new'] = imdb_data['imdbRating'] - imdb_data['imdbVotes']
imdb_data.head()

The output is:

    Plot    Title   imdbVotes   Poster  imdbRating  Genre   imdbID  Year    Language    new
0   Despite his tarnished reputation after the eve...   The Dark Knight Rises   2679.0  http://ia.media-imdb.com/images/M/MV5BMTk4ODQz...   75.0    Action, Thriller    tt1345836   2012    English     -2604.0
1   0   0   0.0     0   0.0     0   0   0   0   0.0
2   Based on the novel written by Stephen Chbosky,...   The Perks of Being a Wallflower     1270.0  http://ia.media-imdb.com/images/M/MV5BMzIxOTQy...   71.0    Drama, Romance  tt1659337   2012    English     -1199.0
3   Mike Lane is a thirty-year old living in Tampa...   Magic Mike  2580.0  http://ia.media-imdb.com/images/M/MV5BMTQzMDMz...   51.0    Comedy, Drama   tt1915581   2012    English     -2529.0
4   When Bond's latest assignment goes gravely wro...   Skyfall     1807.0  http://ia.media-imdb.com/images/M/MV5BMjAyODkz...   68.0    Action, Thriller    tt1074638   2012    English     -1739.0
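
The new column above holds the plain difference, whereas the question asks for a value proportional to the squared difference. A minimal sketch of that last step, assuming a proportionality constant k (hypothetical, set to 1.0 here) and a tiny hand-built frame that reuses values from the output above plus one deliberately invalid rating:

```python
import pandas as pd

k = 1.0  # hypothetical proportionality constant; choose whatever scale is needed

# Small stand-in frame; the real data would come from IMDB_data.csv.
imdb_data = pd.DataFrame({
    "imdbRating": ["75", "71", "N/A"],
    "imdbVotes": ["2679", "1270", "2580"],
})

# Coerce to numeric; unparseable entries such as "N/A" become NaN.
for col in ["imdbRating", "imdbVotes"]:
    imdb_data[col] = pd.to_numeric(imdb_data[col], errors="coerce")

# Value proportional to the squared difference of the two columns.
diff = imdb_data["imdbRating"] - imdb_data["imdbVotes"]
imdb_data["new"] = k * diff ** 2

print(imdb_data)
```

Rows where either column could not be parsed end up as NaN in new. Whether to drop or fill them depends on the analysis.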