Starting dataframe:
pd.DataFrame({'col1': ['one', 'None', 'None'], 'col2': ['None', 'None', 'six'], 'col3': ['None', 'eight', 'None']})
End goal:
pd.DataFrame({'col4': ['one', 'eight', 'six']})
What I tried to do:
df['col1'].map(str)+df['col2'].map(str)+df['col3'].map(str)
How can I merge the values of multiple pandas object-type columns into one column while ignoring "None" values? Note that in this dataset, a row will never contain more than one real value, so each cell of the final dataframe holds at most one value.
Answer 0 (score: 4)
You have the string 'None', not actual null values, so you'll need to replace those strings first.
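To make the distinction concrete, here is a minimal sketch using the dataframe from the question. The names `attempt`, `cleaned`, `missing_before` and `missing_after` are only illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['one', 'None', 'None'],
    'col2': ['None', 'None', 'six'],
    'col3': ['None', 'eight', 'None'],
})

# The attempt from the question: every 'None' is an ordinary string,
# so it is concatenated along with the real values.
attempt = df['col1'].map(str) + df['col2'].map(str) + df['col3'].map(str)
print(attempt.tolist())

# pandas does not count the string 'None' as missing ...
missing_before = int(df.isna().sum().sum())

# ... but it does once the string has been replaced with NaN.
cleaned = df.replace('None', np.nan)
missing_after = int(cleaned.isna().sum().sum())
print(missing_before, missing_after)
```

Once the placeholder strings are real missing values, the usual tools (`fillna`, `pd.notnull`, `dropna`) can be applied.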
Option 1: replace/mask/where + fillna + agg
df.replace('None', np.nan).fillna('').agg(''.join, axis=1).to_frame('col4')
Or,
df.mask(df.eq('None')).fillna('').agg(''.join, axis=1).to_frame('col4')
Or,
df.where(df.ne('None')).fillna('').agg(''.join, axis=1).to_frame('col4')
col4
0 one
1 eight
2 six
Option 2: replace + pd.notnull
v = df.replace('None', np.nan).values.ravel()
pd.DataFrame(v[pd.notnull(v)], columns=['col4'])
col4
0 one
1 eight
2 six
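One thing to keep in mind with Option 2: it flattens the frame row by row and keeps only the non-null entries, so the result lines up with the original rows only when every row has exactly one real value. The sketch below illustrates this; `sparse` is a hypothetical frame, not from the question, containing a row with no real value at all.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['one', 'None', 'None'],
    'col2': ['None', 'None', 'six'],
    'col3': ['None', 'eight', 'None'],
})

# ravel() flattens in row-major order, so values come out row by row.
v = df.replace('None', np.nan).to_numpy().ravel()
result = pd.DataFrame(v[pd.notnull(v)], columns=['col4'])

# Hypothetical variant: the middle row has no real value.
sparse = pd.DataFrame({
    'col1': ['one', 'None', 'None'],
    'col2': ['None', 'None', 'six'],
})
w = sparse.replace('None', np.nan).to_numpy().ravel()
sparse_result = pd.DataFrame(w[pd.notnull(w)], columns=['col4'])

print(len(result), len(sparse_result))
```

In the second case the output has fewer rows than the input, so it can no longer be assigned back as a column without realigning the index.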
Option 3: justify
A solution leveraging Divakar's excellent justify function:
pd.DataFrame(justify(df.values, invalid_val='None')[:, 0], columns=['col4'])
col4
0 one
1 eight
2 six
Reference
(Note, you will need to modify the function slightly to play nicely with string data.)
def justify(a, invalid_val=0, axis=1, side='left'):
    """
    Justifies a 2D array

    Parameters
    ----------
    a : ndarray
        Input array to be justified
    invalid_val : scalar
        Value treated as "empty" and pushed to the opposite side
    axis : int
        Axis along which justification is to be made
    side : str
        Direction of justification. It could be 'left', 'right', 'up', 'down'
        It should be 'left' or 'right' for axis=1 and 'up' or 'down' for axis=0.
    """
    if invalid_val is np.nan:
        mask = ~np.isnan(a)
    else:
        mask = a != invalid_val
    justified_mask = np.sort(mask, axis=axis)
    if (side == 'up') | (side == 'left'):
        justified_mask = np.flip(justified_mask, axis=axis)
    out = np.full(a.shape, invalid_val, dtype='<U8')  # change to be made is here
    if axis == 1:
        out[justified_mask] = a[mask]
    else:
        out.T[justified_mask.T] = a.T[mask.T]
    return out
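For completeness, here is a self-contained sketch that runs the string-modified function above on the question's data. The function body is copied from this answer (docstring shortened); only the imports and the sample frame are added.

```python
import numpy as np
import pandas as pd


def justify(a, invalid_val=0, axis=1, side='left'):
    """Push the valid entries of a 2D array to one side (string variant)."""
    if invalid_val is np.nan:
        mask = ~np.isnan(a)
    else:
        mask = a != invalid_val
    # Sorting a boolean mask moves True to the end; flipping moves it to the front.
    justified_mask = np.sort(mask, axis=axis)
    if (side == 'up') | (side == 'left'):
        justified_mask = np.flip(justified_mask, axis=axis)
    # Fixed-width unicode output: strings longer than 8 characters are truncated.
    out = np.full(a.shape, invalid_val, dtype='<U8')
    if axis == 1:
        out[justified_mask] = a[mask]
    else:
        out.T[justified_mask.T] = a.T[mask.T]
    return out


df = pd.DataFrame({
    'col1': ['one', 'None', 'None'],
    'col2': ['None', 'None', 'six'],
    'col3': ['None', 'eight', 'None'],
})

# Left-justify each row, then keep the first column.
result = pd.DataFrame(justify(df.values, invalid_val='None')[:, 0], columns=['col4'])
print(result)
```

Because of `dtype='<U8'`, values longer than eight characters would be cut off, so the width would need to be raised for longer strings.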
Answer 1 (score: 4)
Another way, for the sake of giving you options:
pd.DataFrame(df[df!='None'].stack().values, columns=['col4'])
col4
0 one
1 eight
2 six
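This works because `df[df != 'None']` turns the placeholder cells into NaN and `stack` has historically dropped missing values by default. As far as I know, newer pandas releases are moving away from that default, so a slightly more defensive sketch adds an explicit `dropna()`. The names `masked` and `stacked` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['one', 'None', 'None'],
    'col2': ['None', 'None', 'six'],
    'col3': ['None', 'eight', 'None'],
})

# Cells equal to 'None' become NaN; everything else is kept.
masked = df[df != 'None']

# stack() produces one long Series ordered row by row;
# dropna() is a no-op if stack() already removed the NaN entries.
stacked = masked.stack().dropna()

result = pd.DataFrame(stacked.to_numpy(), columns=['col4'])
print(result)
```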
Answer 2 (score: 3)
Or
df[df!='None'].fillna('').sum(1)
Out[1054]:
0 one
1 eight
2 six
dtype: object
With list and map:
list(map(lambda x : ''.join(x) ,df.replace({'None':''}).values))
Out[1061]: ['one', 'eight', 'six']
Answer 3 (score: 0)
df['col4']=df.apply(lambda x: x.max(),axis=1)
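A caveat worth illustrating: the row-wise `max` relies on string ordering. It picks the real value here only because 'None' starts with an uppercase 'N', which sorts before the lowercase values in this dataset. The frame `other` below is a hypothetical counterexample, not data from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['one', 'None', 'None'],
    'col2': ['None', 'None', 'six'],
    'col3': ['None', 'eight', 'None'],
})

# 'None' < 'eight' < 'one' < 'six' in string ordering, so max() finds the real value.
result = df.apply(lambda row: row.max(), axis=1)

# Hypothetical data: 'Apple' sorts before 'None', so max() returns the placeholder.
other = pd.DataFrame({'col1': ['Apple', 'None'], 'col2': ['None', 'zebra']})
other_result = other.apply(lambda row: row.max(), axis=1)

print(result.tolist(), other_result.tolist())
```

If the real values may start with uppercase letters or digits, one of the approaches that replaces 'None' explicitly is the safer choice.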