Question

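I am trying to create a new column that contains, for each row, a list of all the entries from the existing columns that are not null.

I would like to produce this column without having to iterate over every row.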

  col1   col2   col3   output       
  a      NaN    b      [a,b]        
  c      d      e      [c,d,e]      
  f      g      NaN    [f,g]

Any help would be greatly appreciated.

Answer 1

Use DataFrame.agg with axis=1 to call dropna and tolist on each row:

df.agg(lambda x: x.dropna().tolist(), axis=1)

0       [a, b]
1    [c, d, e]
2       [f, g]
dtype: object
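For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt from the table in the question, and the variable name `result` is only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (NaN marks the missing cells).
df = pd.DataFrame({
    "col1": ["a", "c", "f"],
    "col2": [np.nan, "d", "g"],
    "col3": ["b", "e", np.nan],
})

# With axis=1 the lambda receives one row at a time as a Series.
# dropna() removes the missing cells, tolist() turns the rest into a list.
result = df.agg(lambda x: x.dropna().tolist(), axis=1)
print(result)
```

Note that this still calls the lambda once per row internally, so it is concise rather than truly vectorized.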

If you need a comma-separated string instead, use str.cat or str.join:

df.agg(lambda x: x.dropna().str.cat(sep=','), axis=1)
# df.agg(lambda x: ','.join(x.dropna()), axis=1)

0      a,b
1    c,d,e
2      f,g
dtype: object
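Both variants assume every non-null cell is already a string. If the frame might hold other types, one possible workaround (my assumption, not something the question requires) is to cast to str before joining. A rough sketch, with `joined` as an illustrative name:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "c", "f"],
    "col2": [np.nan, "d", "g"],
    "col3": ["b", "e", np.nan],
})

# Drop the missing cells first, then cast what is left to str so that
# ','.join() does not fail if a cell holds a number instead of a string.
joined = df.agg(lambda x: ",".join(x.dropna().astype(str)), axis=1)
print(joined)
```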

If performance matters, I recommend using a list comprehension:

df['output'] = [x[pd.notna(x)].tolist() for x in df.values]
df

  col1 col2 col3     output
0    a  NaN    b     [a, b]
1    c    d    e  [c, d, e]
2    f    g  NaN     [f, g]

This works because your DataFrame consists of strings. For more on when loops are appropriate with pandas, see this discussion: For loops with pandas - When should I care?
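To make the mechanics of the comprehension a bit more explicit, here is a sketch that takes one row apart. The names `rows`, `first`, `mask` and `output` are illustrative, and I use `df.to_numpy()`, which the pandas documentation recommends over `.values`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "c", "f"],
    "col2": [np.nan, "d", "g"],
    "col3": ["b", "e", np.nan],
})

# Each row of the 2-D array is itself a 1-D NumPy array.
rows = df.to_numpy()
first = rows[0]

# pd.notna returns a boolean mask that is False wherever a cell is missing.
mask = pd.notna(first)

# Boolean indexing keeps only the cells where the mask is True.
print(first[mask].tolist())

# The same three steps, applied to every row in one comprehension.
output = [row[pd.notna(row)].tolist() for row in rows]
df["output"] = output
```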

Answer 2

Use a loop:

df['New'] = [[y for y in x if y == y] for x in df.values.tolist()]
df
Out[654]: 
  col1 col2 col3        New
0    a  NaN    b     [a, b]
1    c    d    e  [c, d, e]
2    f    g  NaN     [f, g]
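The `y == y` test relies on the fact that a float NaN never compares equal to itself, so it drops NaN and keeps everything else. A small sketch of that behaviour, assuming the missing values are float NaN; as far as I know it would not filter out `None`, and `pd.notna` is the more general check. The sample lists are made up for illustration.

```python
import pandas as pd

nan = float("nan")

# NaN is the only common value that is not equal to itself.
print(nan == nan)

# So the comparison acts as a "not NaN" filter.
row = ["a", nan, "b"]
kept = [y for y in row if y == y]

# None == None is True, so None would slip through the y == y filter.
# pd.notna treats both None and NaN as missing.
mixed = ["a", None, nan, "b"]
kept_by_eq = [y for y in mixed if y == y]
kept_by_notna = [y for y in mixed if pd.notna(y)]
```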

Or use stack together with groupby:

df['New'] = df.stack().groupby(level=0).agg(list)
df
Out[659]: 
  col1 col2 col3        New
0    a  NaN    b     [a, b]
1    c    d    e  [c, d, e]
2    f    g  NaN     [f, g]
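The stack approach depends on stack dropping the NaN cells. As far as I am aware, the newer stack implementation (opt-in via `future_stack=True` since pandas 2.1) keeps them, so a version-independent sketch would call dropna explicitly. The names `stacked` and `lists` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "c", "f"],
    "col2": [np.nan, "d", "g"],
    "col3": ["b", "e", np.nan],
})

# stack() turns the frame into one long Series indexed by (row, column).
# The explicit dropna() removes missing cells whether or not stack() did.
stacked = df.stack().dropna()

# Grouping on the first index level collects the cells of each original row.
lists = stacked.groupby(level=0).agg(list)

# Assignment aligns on the row index.
df["New"] = lists
print(df)
```

One caveat: a row whose cells are all missing has no entries in `stacked`, so after the assignment it ends up with NaN rather than an empty list.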

Answer 3

Try this:

df['output'] = df.apply(lambda x: x.dropna().to_list(), axis=1)
