Pandas DataFrame not getting updated

Time: 2017-10-24 12:19:24

Tags: python pandas

import pandas as pd
import numpy as np
titanic= pd.read_csv("C:\\Users\\Shailesh.Rana\\Downloads\\train.csv")
title = []  # to extract titles out of names
for i in range(len(titanic)):
    title.append(titanic.loc[:, "Name"].iloc[i].split(" ")[1])  # index 1 is the title
titanic.iloc[(np.array(title) == "Master.") & (np.array(titanic.Age.isnull()))].loc[:, "Age"] = 3.5
# rows with the "Master." title and NaN as age

The last line does not change the original dataset. In fact, if I run this line again, it still shows a Series with 4 NaN values.

1 answer:

Answer 0: (score: 0)

Use str.split together with str[1] to select the second element of each split list.
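To make the accessor chain concrete, here is a minimal sketch of what each step returns. The three sample names are illustrative assumptions, not rows from the real train.csv.

```python
import pandas as pd

# Hypothetical sample names, used only for illustration.
names = pd.Series(["John Master.", "Joe", "Mary Master."])

# Split each name on whitespace; every element becomes a list of tokens.
tokens = names.str.split()

# .str[1] picks the second token of each list, or NaN when the list is too short.
second = tokens.str[1]

# Comparing NaN with a string gives False, so one-word names drop out of the mask.
is_master = second == "Master."
print(is_master.tolist())
```

Because the result is a boolean Series aligned with the DataFrame index, it can be combined with other conditions using & and passed straight to .loc.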

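There is also no need to convert to a numpy array, and the iloc should be removed: the chained .iloc[...].loc[...] assigns to a temporary copy of the selected rows, so the original DataFrame is never modified.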

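import numpy as np
import pandas as pd

# small sample standing in for train.csv
titanic = pd.DataFrame({'Name': ['John Master.', 'Joe', 'Mary Master.'],
                        'Age': [10, 20, np.nan]})

# select rows and column in a single .loc call so the assignment hits titanic itself
titanic.loc[(titanic.Name.str.split().str[1] == "Master.") & (titanic.Age.isnull()), "Age"] = 3.5

print(titanic)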
    Age          Name
0  10.0  John Master.
1  20.0           Joe
2   3.5  Mary Master.
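As a rough sketch of why the question's version leaves the NaN values in place, the two forms can be compared side by side on the same sample frame. The variable names mask, missing_after_chained and missing_after_single are hypothetical, and the chained line uses .loc[mask] as a simplified stand-in for the question's .iloc[...] selection.

```python
import warnings

import numpy as np
import pandas as pd

titanic = pd.DataFrame({"Name": ["John Master.", "Joe", "Mary Master."],
                        "Age": [10, 20, np.nan]})

# Boolean mask: title is "Master." and Age is missing.
mask = (titanic.Name.str.split().str[1] == "Master.") & titanic.Age.isnull()

# Chained form: the first indexer builds a new DataFrame from the matching rows,
# and the assignment lands on that temporary object only.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about chained assignment
    titanic.loc[mask].loc[:, "Age"] = 3.5
missing_after_chained = int(titanic.Age.isnull().sum())

# Single-call form: rows and column are selected in one .loc, so titanic is updated.
titanic.loc[mask, "Age"] = 3.5
missing_after_single = int(titanic.Age.isnull().sum())

print(missing_after_chained, missing_after_single)
```

The count of missing ages only drops after the single .loc call, which matches the behaviour described in the question.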