Pandas: Move data from two DataFrames into another one with a tuple index

Time: 2019-11-13 16:24:02

Tags: python python-3.x pandas dataframe

I have the following three DataFrames:

final_df

                                other   ref
(2014-12-24 13:20:00-05:00, a)  NaN     NaN
(2014-12-24 13:40:00-05:00, b)  NaN     NaN
(2018-07-03 14:00:00-04:00, d)  NaN     NaN

ref_df

                                a   b   c   d
2014-12-24 13:20:00-05:00       1   2   3   4
2014-12-24 13:40:00-05:00       2   3   4   5
2017-11-24 13:10:00-05:00       ..............
2018-07-03 13:25:00-04:00       ..............
2018-07-03 14:00:00-04:00       9   10  11  12
2019-07-03 13:10:00-04:00       ..............

other_df

                                a   b   c   d
2014-12-24 13:20:00-05:00       10  20  30  40
2014-12-24 13:40:00-05:00       20  30  40  50
2017-11-24 13:10:00-05:00       ..............
2018-07-03 13:20:00-04:00       ..............
2018-07-03 13:25:00-04:00       ..............
2018-07-03 14:00:00-04:00       90  100 110 120
2019-07-03 13:10:00-04:00       ..............

I need to replace the NaN values in final_df with the matching values from ref_df and other_df: for each (timestamp, column) tuple in the index of final_df, the ref and other columns should take the value found at that row and column of the corresponding DataFrame.

How can I do this?

2 answers:

Answer 0 (score: 2)

pandas.DataFrame.lookup

final_df['ref'] = ref_df.lookup(*zip(*final_df.index))
final_df['other'] = other_df.lookup(*zip(*final_df.index))
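
Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the two lines above only run on older versions. A rough equivalent for newer versions can be sketched with NumPy integer indexing. The helper name lookup_values is hypothetical, and the small frames mirror the demo data further down rather than the asker's real data:

```python
import pandas as pd

def lookup_values(df, rows, cols):
    """Hypothetical stand-in for the removed DataFrame.lookup."""
    # Translate labels to integer positions; get_indexer returns -1 for unknown labels
    r = df.index.get_indexer(rows)
    c = df.columns.get_indexer(cols)
    if (r < 0).any() or (c < 0).any():
        raise KeyError("one or more labels were not found")
    # Pair-wise (row, column) selection via NumPy integer-array indexing
    return df.to_numpy()[r, c]

idx = pd.MultiIndex.from_tuples([(1, 'a'), (2, 'b'), (3, 'd')])
final_df = pd.DataFrame(index=idx, columns=['other', 'ref'])
ref_df = pd.DataFrame([[1, 2, 3, 4], [2, 3, 4, 5], [9, 10, 11, 12]],
                      index=[1, 2, 3], columns=list('abcd'))

# Split the MultiIndex into its row-label and column-label levels
rows = final_df.index.get_level_values(0)
cols = final_df.index.get_level_values(1)
final_df['ref'] = lookup_values(ref_df, rows, cols)
```

Like the original lookup, this sketch raises a KeyError when a label is missing instead of filling in a placeholder.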

map and get

For when some entries are missing

final_df['ref'] = list(map(ref_df.stack().get, final_df.index))
final_df['other'] = list(map(other_df.stack().get, final_df.index))
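
To illustrate why this variant tolerates missing entries: Series.get returns None (its default) when a key is absent instead of raising. A minimal sketch, using a made-up two-row frame and a key list invented for illustration:

```python
import pandas as pd

ref_df = pd.DataFrame([[1, 2], [3, 4]], index=[1, 2], columns=['a', 'b'])

# stack() turns the frame into a Series keyed by (row label, column label)
stacked = ref_df.stack()

# (3, 'a') has no matching row in ref_df
keys = [(1, 'a'), (2, 'b'), (3, 'a')]
values = list(map(stacked.get, keys))
```

Here values holds 1 and 4 for the first two keys and None for the key that does not exist.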

Demo

Setup

import pandas as pd

idx = pd.MultiIndex.from_tuples([(1, 'a'), (2, 'b'), (3, 'd')])
final_df = pd.DataFrame(index=idx, columns=['other', 'ref'])
ref_df = pd.DataFrame([
    [ 1,  2,  3,  4],
    [ 2,  3,  4,  5],
    [ 9, 10, 11, 12]
], [1, 2, 3], ['a', 'b', 'c', 'd'])
other_df = pd.DataFrame([
    [ 10,  20,  30,  40],
    [ 20,  30,  40,  50],
    [ 90, 100, 110, 120]
], [1, 2, 3], ['a', 'b', 'c', 'd'])

print(final_df, ref_df, other_df, sep='\n\n')

    other  ref
1 a   NaN  NaN
2 b   NaN  NaN
3 d   NaN  NaN

   a   b   c   d
1  1   2   3   4
2  2   3   4   5
3  9  10  11  12

    a    b    c    d
1  10   20   30   40
2  20   30   40   50
3  90  100  110  120

Result

final_df['ref'] = ref_df.lookup(*zip(*final_df.index))
final_df['other'] = other_df.lookup(*zip(*final_df.index))

final_df

     other  ref
1 a     10    1
2 b     30    3
3 d    120   12

Answer 1 (score: 0)

Another solution, which also copes with dates that are missing from ref_df and other_df:

index = pd.MultiIndex.from_tuples(final_df.index)
ref = ref_df.stack().rename('ref')
other = other_df.stack().rename('other')

result = pd.DataFrame(index=index).join(ref).join(other)
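
A small runnable sketch of the join approach with one deliberately missing row. The data is invented for illustration (row 4 does not exist in either source frame), and final_df already carries a MultiIndex here, so the from_tuples conversion is skipped:

```python
import pandas as pd

# Row label 4 is absent from ref_df and other_df
idx = pd.MultiIndex.from_tuples([(1, 'a'), (2, 'b'), (4, 'd')])
final_df = pd.DataFrame(index=idx, columns=['other', 'ref'])
ref_df = pd.DataFrame([[1, 2, 3, 4], [2, 3, 4, 5]],
                      index=[1, 2], columns=list('abcd'))
other_df = ref_df * 10

# Stack each source frame into a Series keyed by (row label, column label)
ref = ref_df.stack().rename('ref')
other = other_df.stack().rename('other')

# Left joins keep every key of final_df; unmatched keys become NaN
result = pd.DataFrame(index=final_df.index).join(ref).join(other)
```

Because the joins are left joins on the index, the (4, 'd') row stays in result with NaN in both columns rather than raising an error.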