How to merge multiple pandas column object type values into one column while ignoring "None"?

Time: 2018-03-25 19:29:37

Tags: python pandas

Starting dataframe:

df = pd.DataFrame({'col1': ['one', 'None', 'None'], 'col2': ['None', 'None', 'six'], 'col3': ['None', 'eight', 'None']})

End goal:

pd.DataFrame({'col4': ['one', 'eight', 'six']})

What I tried to do:

df['col1'].map(str)+df['col2'].map(str)+df['col3'].map(str)

How can I merge multiple pandas columns of object dtype into one column while ignoring the "None" values? In this dataset, each row will never contain more than one value that is not "None", so every cell of the final column holds a single value.

4 answers:

Answer 0 (score: 4)

Your columns hold the string 'None', not actual null values, so you'll need to replace those strings first.
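
To see why the replacement matters, here is a minimal sketch using the question's frame. It compares null detection before and after the 'None' strings are swapped for NaN; the names `cleaned`, `nulls_before` and `nulls_after` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['one', 'None', 'None'],
                   'col2': ['None', 'None', 'six'],
                   'col3': ['None', 'eight', 'None']})

# The string 'None' is ordinary text, so pandas does not treat it as missing.
nulls_before = int(df.isna().sum().sum())

# After replacing the placeholder with NaN, null-aware methods can see the gaps.
cleaned = df.replace('None', np.nan)
nulls_after = int(cleaned.isna().sum().sum())

print(nulls_before, nulls_after)
```

Once the placeholders are real nulls, methods such as fillna, notnull and stack can act on them, which is what the options below rely on.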

Option 1
replace/mask/where + fillna + agg

df.replace('None', np.nan).fillna('').agg(''.join, axis=1).to_frame('col4')

Or,

df.mask(df.eq('None')).fillna('').agg(''.join, axis=1).to_frame('col4')

Or,

df.where(df.ne('None')).fillna('').agg(''.join, axis=1).to_frame('col4')

    col4
0    one
1  eight
2    six

Option 2
replace + pd.notnull

v = df.replace('None', np.nan).values.ravel()
pd.DataFrame(v[pd.notnull(v)], columns=['col4'])

    col4
0    one
1  eight
2    six
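
Option 2 flattens the frame row by row and keeps every non-null entry, so the result lines up with the input rows only when each row holds exactly one real value, as the question states. A hedged sketch of what happens otherwise, using a made-up frame `df_gap` whose middle row is entirely 'None':

```python
import numpy as np
import pandas as pd

# Hypothetical data: the middle row has no real value at all.
df_gap = pd.DataFrame({'col1': ['one', 'None', 'None'],
                       'col2': ['None', 'None', 'six']})

v = df_gap.replace('None', np.nan).values.ravel()
flat = pd.DataFrame(v[pd.notnull(v)], columns=['col4'])

# Only two values survive, so the result no longer has one row per input row.
print(len(df_gap), len(flat))
```

If rows can be empty, Option 1 is probably the safer choice because it keeps one output row per input row.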

Option 3
A solution leveraging Divakar's excellent justify function:

pd.DataFrame(justify(df.values, invalid_val='None')[:, 0], columns=['col4'])

    col4
0    one
1  eight
2    six

Reference
(Note: the original function needs a small modification to work with string data; the changed line is marked in the code.)

def justify(a, invalid_val=0, axis=1, side='left'):    
    """
    Justifies a 2D array

    Parameters
    ----------
    a : ndarray
        Input array to be justified
    invalid_val : scalar
        Value treated as empty; valid entries are moved past it
    axis : int
        Axis along which justification is to be made
    side : str
        Direction of justification. It could be 'left', 'right', 'up', 'down'
        It should be 'left' or 'right' for axis=1 and 'up' or 'down' for axis=0.

    """

    if invalid_val is np.nan:
        mask = ~np.isnan(a)
    else:
        mask = a!=invalid_val
    justified_mask = np.sort(mask,axis=axis)
    if (side=='up') | (side=='left'):
        justified_mask = np.flip(justified_mask,axis=axis)
    out = np.full(a.shape, invalid_val, dtype='<U8')    # changed from the original: string dtype (holds up to 8 characters)
    if axis==1:
        out[justified_mask] = a[mask]
    else:
        out.T[justified_mask.T] = a.T[mask.T]
    return out
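
The fixed-width dtype '<U8' silently truncates strings longer than eight characters. One possible adjustment, shown here as a sketch rather than Divakar's original code, is to allocate the output with dtype=object. The name `justify_obj` and the sample frame `df_long` are assumptions made for illustration, and only the left-justified, axis=1 case is handled.

```python
import numpy as np
import pandas as pd

def justify_obj(a, invalid_val='None'):
    """Left-justify the valid entries of a 2D object array along axis=1."""
    mask = a != invalid_val
    # Sorting booleans puts False first; flipping moves the True slots to the left.
    justified_mask = np.flip(np.sort(mask, axis=1), axis=1)
    out = np.full(a.shape, invalid_val, dtype=object)  # no fixed string width
    out[justified_mask] = a[mask]
    return out

# Hypothetical data with one value longer than eight characters.
df_long = pd.DataFrame({'col1': ['one', 'None', 'None'],
                        'col2': ['None', 'None', 'a-longer-value'],
                        'col3': ['None', 'eight', 'None']})

arr = df_long.to_numpy(dtype=object)
result = pd.DataFrame(justify_obj(arr)[:, 0], columns=['col4'])
print(result)
```

An object array gives up the compact fixed-width storage, which should not matter for a frame of this size.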

Answer 1 (score: 4)

Another way, for the sake of giving you options:

pd.DataFrame(df[df!='None'].stack().values, columns=['col4'])

    col4
0    one
1  eight
2    six
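
The expression above works in stages: `df[df != 'None']` turns every 'None' into NaN, and `stack` flattens the frame row by row into a Series. A sketch of the intermediate steps follows. The explicit `dropna()` is an addition here, included on the assumption that newer pandas releases may keep NaN entries when stacking.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['one', 'None', 'None'],
                   'col2': ['None', 'None', 'six'],
                   'col3': ['None', 'eight', 'None']})

masked = df[df != 'None']          # 'None' strings become NaN
stacked = masked.stack().dropna()  # one entry per remaining value, in row order
result = pd.DataFrame(stacked.values, columns=['col4'])
print(result)
```

As with the ravel-based approach, this relies on each row contributing exactly one value.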

Answer 2 (score: 3)

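Alternatively, fill the gaps with empty strings and sum across each row, which concatenates the strings: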

df[df!='None'].fillna('').sum(1)
Out[1054]: 
0      one
1    eight
2      six
dtype: object

Using list and map:

list(map(lambda x : ''.join(x) ,df.replace({'None':''}).values))
Out[1061]: ['one', 'eight', 'six']

Answer 3 (score: 0)

df['col4'] = df.apply(lambda x: x.max(), axis=1)
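
This relies on plain string ordering: lowercase letters sort after the capital 'N' in 'None', so each real value in the sample wins the row-wise max. A hedged sketch with made-up data (`df_caps` is hypothetical) shows the approach picking 'None' when a value starts with an earlier capital letter:

```python
import pandas as pd

# Hypothetical data: 'Apple' sorts before 'None' in a plain string comparison.
df_caps = pd.DataFrame({'col1': ['Apple', 'None'],
                        'col2': ['None', 'six']})

picked = df_caps.apply(lambda x: x.max(), axis=1)
print(picked.tolist())
```

For data that is not all lowercase, replacing 'None' before combining the columns is the more robust route.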