Remove NA values from the rows of a pandas DataFrame

Time: 2021-07-07 16:02:09

Tags: python pandas numpy na

How can I use pandas to drop the NaN values from a DataFrame so that I get, for each row, a list containing only the actual values?

**Input**
A |B |C
-----------
x |y |NA
x |NA|NA
X |Y |NA

**Output**

[[x,y],
 [x],
 [X,Y]
]

I tried passing each row:

dataset.apply(lambda row: row[pd.notna(row)],axis=0).to_numpy()

array([["Belkin 325VA UPS Surge Protector, 6'",
        'Master Caster Door Stop, Large Neon Orange',
        'Easy-staple paper', 'Polycom VVX 310 VoIP phone',
        'Acco Banker\'s Clasps, 5 3/4"-Long',
        'Verbatim 25 GB 6x Blu-ray Single Layer Recordable Disc, 1/Pack',
        'Fellowes Advanced Computer Series Surge Protectors',
        'GBC DocuBind 200 Manual Binding Machine',
        'Tenex Personal Project File with Scoop Front Design, Black',
        'Avery Binding System Hidden Tab Executive Style Index Sets',
        'High Speed Automatic Electric Letter Opener', nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan],

Can you explain the best way to do this?

1 answer:

Answer 0 (score: 0)

You can try:

out = df.agg(lambda x: list(x.dropna()), axis=1).tolist()
# You can also use apply() in place of agg().
# If you need an array rather than a list, use the values attribute or the to_numpy() method instead of tolist():
out = df.agg(lambda x: list(x.dropna()), axis=1).values

Output of out:

[['x', 'y'], ['x'], ['X', 'Y']]
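
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame `df` is an assumption, rebuilt by hand from the sample input in the question (with NA stored as `np.nan`), and `out_alt` is just an alternative I find readable, not something the approach above requires. The key point is `axis=1`, which hands each row to the lambda; `axis=0` would hand over whole columns instead.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the sample input from the question.
df = pd.DataFrame({
    "A": ["x", "x", "X"],
    "B": ["y", np.nan, "Y"],
    "C": [np.nan, np.nan, np.nan],
})

# axis=1 passes each row to the lambda as a Series;
# dropna() removes the missing entries before converting to a list.
out = df.agg(lambda row: list(row.dropna()), axis=1).tolist()

# Alternative without agg/apply: iterate over the rows of the
# underlying array and keep only the non-missing values.
out_alt = [[v for v in row if pd.notna(v)] for row in df.to_numpy()]

print(out)
print(out_alt)
```

Both variants produce ragged lists, so a plain Python list of lists is usually a more natural container here than a NumPy array.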