
OK, I have the following code:

# group by Sex, Pclass, and Title
grouped = titanic.groupby(['Sex','Pclass', 'Title'])
# view the median Age by the grouped features
grouped.Age.median()

grouped.describe(include="all")
# fill each missing Age with the median Age of its group;
# transform returns a Series aligned with titanic's original index
titanic['Age'] = grouped['Age'].transform(lambda x: x.fillna(x.median()))

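This is based on Kaggle's Titanic competition. The code fills the missing Age values with the median age of each group defined by Pclass, Sex, and Title.

My question is this: how should I fill them in if I want to use cross-validation? With cross-validation, only the training values should be used for the imputation, but I don't know whether a pipeline would use only the training values or all of the values.

Thanks!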
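One way to keep the imputation inside each training fold is to wrap the group-median logic in a scikit-learn transformer, so the medians are learned in `fit` and only applied in `transform`. The following is a minimal sketch, not a definitive implementation: the class name `GroupMedianImputer`, its parameters, the fallback to the overall training median, and the toy data are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class GroupMedianImputer(BaseEstimator, TransformerMixin):
    """Hypothetical helper: fill NaNs in `target` with group medians learned in fit()."""

    def __init__(self, group_cols=("Sex", "Pclass", "Title"), target="Age"):
        self.group_cols = group_cols
        self.target = target

    def fit(self, X, y=None):
        cols = list(self.group_cols)
        # Medians come only from the rows passed to fit (the training fold).
        self.group_medians_ = (
            X.groupby(cols)[self.target].median().rename("_group_median").reset_index()
        )
        # Assumed fallback for groups unseen in training or with no known ages.
        self.global_median_ = X[self.target].median()
        return self

    def transform(self, X):
        X = X.copy()
        cols = list(self.group_cols)
        # A left merge keeps the row order of X, so values can be used positionally.
        merged = X[cols].merge(self.group_medians_, on=cols, how="left")
        fill = merged["_group_median"].fillna(self.global_median_).to_numpy()
        # Keep known ages, replace missing ones with the learned medians.
        X[self.target] = X[self.target].where(X[self.target].notna(), fill)
        return X


# Toy data (made up for illustration, not the real Titanic set).
train = pd.DataFrame({
    "Sex": ["male", "male", "female", "female"],
    "Pclass": [3, 3, 1, 1],
    "Title": ["Mr", "Mr", "Mrs", "Mrs"],
    "Age": [20.0, 30.0, 40.0, float("nan")],
})
test = pd.DataFrame({
    "Sex": ["male", "female", "male"],
    "Pclass": [3, 1, 1],
    "Title": ["Mr", "Mrs", "Master"],
    "Age": [float("nan"), float("nan"), float("nan")],
})

imputer = GroupMedianImputer().fit(train)
filled = imputer.transform(test)
```

If a transformer like this is the first step of a `sklearn.pipeline.Pipeline` and the whole pipeline is passed to `cross_val_score`, the pipeline is refit on the training part of every split, so the held-out rows never contribute to the medians. This assumes `X` is passed as a pandas DataFrame and that later pipeline steps handle the remaining non-numeric columns.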

How to fill missing values with my own algorithm when using cross-validation with pandas

0 answers: