Question

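I am trying to fill the missing values in one column (a categorical text column) with the mode of that same column, computed within the groups defined by another column (also a categorical text column).

Note that Krishna Bandhakavi posted a similar question on Sep 23 at 3:05 PM, but it did not help in my case.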

Attempt 1:

df['col2'] = df.groupby('col1')['col2'].apply(lambda x: x.fillna(x.mode()))
df_cpt['col2'].unique()

No error is raised, but the output below shows that nan is still present.

array(['Passed', nan, 'Pending', 'Registered', 'Applied for sign Off',
       'Verbal', 'Not Applicable'], dtype=object)
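
A possible explanation for attempt 1, sketched on a single hypothetical group: `Series.fillna` aligns a Series argument on index labels, and `mode()` returns a Series labelled 0, 1, ..., so only rows whose labels happen to match get filled. The row labels below (3, 7, 11, 15) are an assumption taken from the positions of the Fibr rows in the sample table.

```python
import numpy as np
import pandas as pd

# Hypothetical group: the four "Fibr" rows, keeping their original row labels.
x = pd.Series([np.nan, "Pass", "InP", "Pass"], index=[3, 7, 11, 15])

modes = x.mode()                    # one-element Series holding "Pass" at label 0
aligned = x.fillna(modes)           # label 3 is not in modes.index, so nothing is filled
scalar = x.fillna(modes.iloc[0])    # a scalar fills every missing row

print(aligned.isna().sum())         # number of rows still missing
print(scalar.tolist())
```

Passing the mode as a scalar (`modes.iloc[0]`) sidesteps the index alignment entirely.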

Attempt 2:

df['col2'] = df.groupby('col1')['col2'].apply(lambda x: x.fillna(x.mode().index[0]))

This raises the following error:

IndexError: index 0 is out of bounds for axis 0 with size 0
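
One plausible cause of this error, shown as a small sketch: if some group in the real data contains only missing values, `mode()` returns an empty Series, and indexing its first element fails. Whether such a group exists in the full DataFrame is an assumption, since only a sample is shown. Note also that `.index[0]` yields the label (0), not the mode value; `.iloc[0]` yields the value.

```python
import numpy as np
import pandas as pd

# Hypothetical group in which every value is missing.
all_missing = pd.Series([np.nan, np.nan], dtype=object)
modes = all_missing.mode()          # empty Series: there is no mode to report

try:
    modes.index[0]
    error_message = None
except IndexError as exc:
    error_message = str(exc)        # indexing an empty index raises IndexError

# For a non-empty result, .index[0] is the label and .iloc[0] is the value.
non_empty = pd.Series(["a", "a", "b"]).mode()
label, value = non_empty.index[0], non_empty.iloc[0]
print(error_message, label, value)
```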

df
--

col1   col2   col3   col4
----   ----   ----   ----
Brod   Pass   xxx    xxx
PSTN   InP    xxx    xxx
LL     InP    xxx    xxx
Fibr   NaN    xxx    xxx
Brod   Pass   xxx    xxx
PSTN   NaN    xxx    xxx
LL     InP    xxx    xxx
Fibr   Pass   xxx    xxx
Brod   NaN    xxx    xxx
PSTN   InP    xxx    xxx
LL     InP    xxx    xxx
Fibr   InP    xxx    xxx
Brod   Pass   xxx    xxx
PSTN   Pass   xxx    xxx
LL     InP    xxx    xxx
Fibr   Pass   xxx    xxx

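I expected the nan values to be replaced by the mode of the column within each group. Please help. I cannot paste the entire df here because it is large.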
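
For reference, a minimal sketch of the expected behaviour on the sample rows above. The helper name `fill_with_group_mode` is hypothetical, and leaving all-missing groups untouched is an assumption about what is wanted; col3 and col4 are omitted because they play no part.

```python
import numpy as np
import pandas as pd

# The sample table from the question (col1 and col2 only).
df = pd.DataFrame({
    "col1": ["Brod", "PSTN", "LL", "Fibr"] * 4,
    "col2": ["Pass", "InP", "InP", np.nan,
             "Pass", np.nan, "InP", "Pass",
             np.nan, "InP", "InP", "InP",
             "Pass", "Pass", "InP", "Pass"],
})

def fill_with_group_mode(s):
    """Fill missing values in one group with that group's first mode."""
    modes = s.mode()
    if modes.empty:          # group has no non-missing values: leave it as is
        return s
    return s.fillna(modes.iloc[0])

# transform keeps the original row order and index.
df["col2"] = df.groupby("col1")["col2"].transform(fill_with_group_mode)
print(df["col2"].unique())
```

When a group has several equally frequent values, `mode()` returns them sorted, so `iloc[0]` picks the first in sort order.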

How to fill a text categorical column with its mode within groups defined by another text categorical column in pandas

0 answers: