Question

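I load 30,000 rows from a file into a pandas DataFrame. Before putting them into the DataFrame I count them and get 30,000; after loading them into the DataFrame I count again and still get the same number. Because some values are missing, I then call apply on several columns to handle missing and unparseable values in the int or float columns. After that I count once more and find that I have just under 3,000 rows left. My question: does the apply method delete rows that are not covered by the condition in the lambda?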

Here is the code I use to load them:

print(counter)
users = pd.DataFrame(users, columns=names.split('\t'))
print(users.shape[0])
user_ids = set(list(map(int, user_ids)))
users['id'] = users['id'].astype('int64')
users['career_level'] = users['career_level'].apply(lambda c: 0 if c == 'NULL' or c == '' else int(c))
users['discipline_id'] = users['discipline_id'].apply(lambda c: 0 if c == 'NULL' or c == '' else int(c))
users['industry_id'] = users['industry_id'].apply(lambda c: 0 if c == 'NULL' or c == '' else int(c))
users['experience_years_experience'] = users['experience_years_experience'].apply(lambda c: 0 if c == 'NULL' or c == '' else int(c))
users['experience_n_entries_class'] = users['experience_n_entries_class'].apply(lambda c: 0 if c == 'NULL' or c == '' else int(c))
users['experience_years_in_current'] = users['experience_years_in_current'].apply(lambda c: 0 if c == 'NULL' or c == '' else int(c))
users['edu_degree'] = users['edu_degree'].apply(lambda c: 0 if c == 'NULL' or c == '' else int(c))
users = users[users['id'].isin(user_ids)]
print(users.shape[0])

Here is the result:

30000
30000
2926

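The first 30k is the counter for the users object, the second 30k is the size of the DataFrame, and the last number (2,926) is the count after using the apply function.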

I would really appreciate it if you could tell me what is going on.

apply does not remove rows
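One way to see where the rows go is to count after each step separately. The sketch below is only an illustration on made-up data: the four-row frame, the contents of `user_ids`, and the helper name `to_int` are assumptions, since the real file and id list are not shown in the question. It suggests that `Series.apply` returns one value per input row, so the row count stays the same, while the boolean filter built with `isin` is the step that keeps only matching ids.

```python
import pandas as pd

# Hypothetical stand-in data; the real file and user_ids are not shown in the question.
users = pd.DataFrame({
    "id": ["1", "2", "3", "4"],
    "career_level": ["3", "NULL", "", "5"],
})
user_ids = {"2", "4", "99"}  # assumed: ids collected elsewhere, still as strings


def to_int(value):
    """Map 'NULL' or '' to 0, otherwise parse the value as an int."""
    return 0 if value in ("NULL", "") else int(value)


rows_before = len(users)

# Same kind of conversions as in the question; neither changes the number of rows.
users["id"] = users["id"].astype("int64")
users["career_level"] = users["career_level"].apply(to_int)
rows_after_apply = len(users)

# The filter keeps only rows whose id appears in user_ids.
wanted = set(map(int, user_ids))
filtered = users[users["id"].isin(wanted)]
rows_after_filter = len(filtered)

print(rows_before, rows_after_apply, rows_after_filter)
```

If the same pattern holds for the real data, printing `users.shape[0]` right before the `isin` line should still show 30000, and the smaller number would then reflect how many ids in the DataFrame are also present in `user_ids`.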

0 answers: