How to extract index/column/data from a Pandas DataFrame based on a logical operation?

Time: 2016-07-26 21:13:53

Tags: python pandas dataframe

I have the following DataFrame:

import numpy as np
import pandas as pd
data = np.random.randn(5,5)  # randn (standard normal) matches the output below; rand only yields values in [0, 1)
df = pd.DataFrame(data, index = list('abcde'), columns = list('ABCDE'))
df = df[df>0]
df
          A         B         C         D   E
a       NaN  2.038740  1.371158       NaN NaN
b  0.575567       NaN  0.462007       NaN NaN
c  0.984802  0.049818  0.129836       NaN NaN
d       NaN       NaN       NaN       NaN NaN
e  0.789563  1.846402       NaN  0.340902 NaN

I want to get every (index, col_name, value) triple for the non-NaN data. How can I do that?

My expected result is:

[('b','A', 0.575567), ('c', 'A', 0.984802), ('e', 'A', 0.789563),...]

1 Answer:

Answer 0: (score: 4)

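You can stack the DataFrame, which automatically drops the NA values, then reset the index so that it becomes regular columns; after that it is easy to convert the result to a list of tuples: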

[tuple(r) for r in df.stack().reset_index().values]

# [('a', 'B', 2.03874),
#  ('a', 'C', 1.371158),
#  ('b', 'A', 0.575567),
#  ('b', 'C', 0.46200699999999995),
#  ('c', 'A', 0.9848020000000001),
#  ('c', 'B', 0.049818),
#  ('c', 'C', 0.12983599999999998),
#  ('e', 'A', 0.789563),
#  ('e', 'B', 1.846402),
#  ('e', 'D', 0.340902)]
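
For anyone who wants to try this on fixed data, here is a minimal self-contained sketch of the same stacking idea. The small DataFrame and the variable names are made up for illustration. The explicit dropna() call is an addition that is not in the answer above: it assumes that some newer pandas releases may keep NaN entries when stacking, so the filter is spelled out rather than relied upon.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a few missing values
df = pd.DataFrame(
    [[np.nan, 2.0, 1.5],
     [0.5, np.nan, 0.25],
     [np.nan, np.nan, np.nan]],
    index=list("abc"),
    columns=list("ABC"),
)

# stack() gives a Series keyed by (index, column);
# dropna() makes the NaN filtering explicit (assumption: stack() may not drop them)
stacked = df.stack().dropna()

# Unpack each (index, column) key together with its value into a plain tuple
triples = [(idx, col, float(val)) for (idx, col), val in stacked.items()]
print(triples)
# [('a', 'B', 2.0), ('a', 'C', 1.5), ('b', 'A', 0.5), ('b', 'C', 0.25)]
```

Row c contains only NaN, so it contributes nothing to the result.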

Or use the DataFrame's to_records() method:

list(df.stack().reset_index().to_records(index = False))
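
Note that the expected result in the question starts with ('b', 'A', ...), ('c', 'A', ...), which suggests column-by-column order, while stacking walks the frame row by row. If that order matters, one possible variation (a sketch on made-up data, not part of the original answer) is to call unstack() on the DataFrame, which for a single-level index returns a Series keyed by (column, index):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a few missing values
df = pd.DataFrame(
    [[np.nan, 2.0],
     [0.5, np.nan],
     [1.0, 3.0]],
    index=list("abc"),
    columns=list("AB"),
)

# unstack() on a frame with a plain (non-MultiIndex) index yields a Series
# keyed by (column, index); it keeps NaN, so drop them explicitly
by_column = df.unstack().dropna()

# Swap the key order back to (index, column, value)
triples_by_column = [(idx, col, float(val)) for (col, idx), val in by_column.items()]
print(triples_by_column)
# [('b', 'A', 0.5), ('c', 'A', 1.0), ('a', 'B', 2.0), ('c', 'B', 3.0)]
```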