movie_rating_T.iloc[:, 5:6]
critic              Toby
title
Just My Luck         NaN
Lady in the Water    NaN
Snakes on a Plane    4.5
Superman Returns     4.0
The Night Listener   NaN
You Me and Dupree    1.0
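Suppose I only want to select the rows where Toby's rating is NaN:
Just My Luck
Lady in the Water
The Night Listener
How can I extract only the NaN values from the DataFrame?
critic              Toby
title
Just My Luck         NaN
Lady in the Water    NaN
The Night Listener   NaN
Appending ['title'] does not work.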
===============================================
df_MovieRatingT[df_MovieRatingT['Toby'].isnull()]
critic              Toby
title
Just My Luck         NaN
Lady in the Water    NaN
The Night Listener   NaN
===============================================
df = DataFrame(ratings)
          critic              title  rating
0  Jack Matthews  Lady in the Water     3.0
1  Jack Matthews  Snakes on a Plane     4.0
2  Jack Matthews  You Me and Dupree     3.5
3  Jack Matthews   Superman Returns     5.0
I want to get this:
critic              Claudia Puig  Gene Seymour  Jack Matthews  Lisa Rose  Mick LaSalle  Toby
title
Just My Luck                 3.0           1.5            NaN        3.0           2.0   NaN
Lady in the Water            NaN           3.0            3.0        2.5           3.0   NaN
Snakes on a Plane            3.5           3.5            4.0        3.5           4.0   4.5
Superman Returns             4.0           5.0            5.0        3.5           3.0   4.0
The Night Listener           4.5           3.0            3.0        3.0           3.0   NaN
You Me and Dupree            2.5           3.5            3.5        2.5           2.0   1.0
I used
movie_rating = ratings.pivot(index='critic', columns='title', values='rating')
but that put title and critic in the same column. How can I fix this?
Answer 0 (score: 1)
You can use isnull with pandas:
df[df['Your column with NaN'].isnull()]
This returns the rows that have NaN in that column.
df2 = df[df['Your column with NaN'].isnull()]['Title']
will return what you want.
An example:
import pandas as pd
import numpy as np
# Five rows; rows 1 and 2 contain missing values
df = pd.DataFrame([range(3), [0, np.nan, np.nan], [0, 0, np.nan], range(3), range(3)], columns=["Col_1", "Col_2", "Col_3"])
print(df)
   Col_1  Col_2  Col_3
0      0    1.0    2.0
1      0    NaN    NaN
2      0    0.0    NaN
3      0    1.0    2.0
4      0    1.0    2.0
# Keep only the rows where Col_3 is NaN
print(df[df['Col_3'].isnull()])
   Col_1  Col_2  Col_3
1      0    NaN    NaN
2      0    0.0    NaN
# From those rows, take a single column
df2 = df[df['Col_3'].isnull()]['Col_2']
print(df2)
1    NaN
2    0.0
Name: Col_2, dtype: float64
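Applied to a table shaped like the one in the question, the same mask might look like the sketch below. It assumes title is the index rather than a regular column, which would explain why selecting ['title'] fails there. The frame is rebuilt by hand from the values shown in the question, and the names toby, unrated and unrated_titles are only illustrative.

```python
import numpy as np
import pandas as pd

# Toby's column from the question, rebuilt by hand with title as the index.
toby = pd.DataFrame(
    {"Toby": [np.nan, np.nan, 4.5, 4.0, np.nan, 1.0]},
    index=pd.Index(
        [
            "Just My Luck",
            "Lady in the Water",
            "Snakes on a Plane",
            "Superman Returns",
            "The Night Listener",
            "You Me and Dupree",
        ],
        name="title",
    ),
)
toby.columns.name = "critic"

# Boolean mask: keep only the rows where Toby has no rating.
unrated = toby[toby["Toby"].isnull()]

# 'title' is the index here, not a column, so read the labels
# from .index instead of selecting ['title'].
unrated_titles = unrated.index.tolist()
print(unrated_titles)
# ['Just My Luck', 'Lady in the Water', 'The Night Listener']
```

If the titles are needed as an ordinary column instead, calling reset_index() on the filtered frame is another option.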
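I understand your problem now; the main issue is the DataFrame itself. When using pivot, the columns argument is wrong...
But you don't need to fix that.
If I'm not mistaken, you now only need the critic and the movie, not the rating itself.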
df_Toby = df.loc[df['critic'] == 'Toby']
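The condition df['critic'] == 'Toby' selects all rows with that critic's name.
To return the titles, select the 'title' column:
titles_Toby = df_Toby['title']
To keep title and rating together:
df_Toby = df_Toby[['title', 'rating']]
After that you can use
exclude_Nan_df_Toby = df_Toby.dropna()
This excludes every row that contains NaN and returns only the rows with a valid rating.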
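For reference, the pivot issue mentioned above can be sketched as follows. This is only an illustration, assuming the desired layout is the one shown in the question (titles as rows, critics as columns), so index and columns are swapped relative to the original call. The small ratings frame is a hand-picked subset of the question's data, and toby_missing and df_toby are illustrative names.

```python
import pandas as pd

# A few (critic, title, rating) rows taken from the question's data.
ratings = pd.DataFrame(
    {
        "critic": ["Jack Matthews", "Jack Matthews", "Toby", "Toby", "Lisa Rose"],
        "title": [
            "Lady in the Water",
            "Snakes on a Plane",
            "Snakes on a Plane",
            "You Me and Dupree",
            "Just My Luck",
        ],
        "rating": [3.0, 4.0, 4.5, 1.0, 3.0],
    }
)

# Titles on the index, critics on the columns (the reverse of the
# call in the question). Missing critic/title pairs become NaN.
movie_rating_T = ratings.pivot(index="title", columns="critic", values="rating")

# Titles Toby has not rated, read from the index of the pivoted table.
toby_missing = movie_rating_T.index[movie_rating_T["Toby"].isnull()].tolist()
print(toby_missing)
# ['Just My Luck', 'Lady in the Water']

# Long-format route from the answer: filter one critic, keep two
# columns, and drop any rows without a rating.
df_toby = ratings.loc[ratings["critic"] == "Toby", ["title", "rating"]].dropna()
```

Note that in the long format an unrated movie may simply have no row for that critic, so dropna() only matters if NaN ratings are stored explicitly.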
Cheers,