I have a DataFrame df1 that contains:
F_Id I_Code F_Date
FT-56832 2 01/09/2019
FT-93828 1 01/09/2019
FT-13853 2 02/09/2019
FT-18858 3 02/09/2019
FT-19010 2 03/09/2019
FT-62064 5 02/09/2019
FT-94494 4 03/09/2019
FT-73594 2 03/09/2019
FT-78590 3 01/09/2019
FT-14296 4 01/09/2019
FT-82529 3 03/09/2019
FT-33266 3 04/09/2019
FT-58456 4 02/09/2019
FT-16693 4 04/09/2019
FT-69073 4 02/09/2019
FT-69649 1 05/09/2019
For each (I_Code, F_Date) pair, there are 5 different IDs associated with it.
I also have another DataFrame df2 with the following columns:
F_Date num_i_found
01/09/2019 5
01/09/2019 3
02/09/2019 5
02/09/2019 5
03/09/2019 3
02/09/2019 4
03/09/2019 4
03/09/2019 5
01/09/2019 5
01/09/2019 4
03/09/2019 3
04/09/2019 5
02/09/2019 4
04/09/2019 5
02/09/2019 4
05/09/2019 4
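I want to generate a new column ID_found in df2 that holds an array of IDs.
For example, for 01/09/2019 with num_i_found equal to 4, ID_found would be 4 of the 5 IDs from df1 (FT-56832, FT-93828, FT-78590, etc.).
Is there a way to achieve this?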
Answer 0 (score: 0)
Create a dictionary of lists keyed by F_Date, then slice each list using the num_i_found value:
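Note: where the values do not match (for example, the first row asks for 5 IDs but gets 4), it is because the sample data has only 4 values for 01/09/2019. I assume the real data has all 5 values for every date in d, so this should work as needed.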
# Map each date to the list of its IDs, in the order they appear in df1
d = df1.groupby('F_Date')['F_Id'].apply(list).to_dict()
print(d)
{'01/09/2019': ['FT-56832', 'FT-93828', 'FT-78590', 'FT-14296'],
'02/09/2019': ['FT-13853', 'FT-18858', 'FT-62064', 'FT-58456', 'FT-69073'],
'03/09/2019': ['FT-19010', 'FT-94494', 'FT-73594', 'FT-82529'],
'04/09/2019': ['FT-33266', 'FT-16693'],
'05/09/2019': ['FT-69649']}
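The note about dates that have fewer than 5 IDs can be checked directly. The following is a minimal sketch that rebuilds the relevant columns of df1 from the sample data in the question; the names expected and short are illustrative only, and the value 5 is taken from the question's description rather than from the data itself.

```python
import pandas as pd

# Sample data from the question (only the columns needed here)
df1 = pd.DataFrame({
    "F_Id": ["FT-56832", "FT-93828", "FT-13853", "FT-18858", "FT-19010",
             "FT-62064", "FT-94494", "FT-73594", "FT-78590", "FT-14296",
             "FT-82529", "FT-33266", "FT-58456", "FT-16693", "FT-69073",
             "FT-69649"],
    "F_Date": ["01/09/2019", "01/09/2019", "02/09/2019", "02/09/2019",
               "03/09/2019", "02/09/2019", "03/09/2019", "03/09/2019",
               "01/09/2019", "01/09/2019", "03/09/2019", "04/09/2019",
               "02/09/2019", "04/09/2019", "02/09/2019", "05/09/2019"],
})

expected = 5  # assumption: number of IDs per date stated in the question

# Count distinct IDs per date and keep the dates that fall short
counts = df1.groupby("F_Date")["F_Id"].nunique()
short = counts[counts < expected]
print(short)
```

For any date listed in short, slicing with a larger num_i_found simply returns every ID available for that date.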
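Then take the first num_i_found IDs for each row. The result is a list per row; join the elements if strings are needed: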
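# For each row, look up the date's IDs (empty list if the date is missing) and keep the first num_i_found
df2['new'] = df2.apply(lambda x: d.get(x['F_Date'], [])[:x['num_i_found']], axis=1)
print(df2)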
F_Date num_i_found new
0 01/09/2019 5 [FT-56832, FT-93828, FT-78590, FT-14296]
1 01/09/2019 3 [FT-56832, FT-93828, FT-78590]
2 02/09/2019 5 [FT-13853, FT-18858, FT-62064, FT-58456, FT-69...
3 02/09/2019 5 [FT-13853, FT-18858, FT-62064, FT-58456, FT-69...
4 03/09/2019 3 [FT-19010, FT-94494, FT-73594]
5 02/09/2019 4 [FT-13853, FT-18858, FT-62064, FT-58456]
6 03/09/2019 4 [FT-19010, FT-94494, FT-73594, FT-82529]
7 03/09/2019 5 [FT-19010, FT-94494, FT-73594, FT-82529]
8 01/09/2019 5 [FT-56832, FT-93828, FT-78590, FT-14296]
9 01/09/2019 4 [FT-56832, FT-93828, FT-78590, FT-14296]
10 03/09/2019 3 [FT-19010, FT-94494, FT-73594]
11 04/09/2019 5 [FT-33266, FT-16693]
12 02/09/2019 4 [FT-13853, FT-18858, FT-62064, FT-58456]
13 04/09/2019 5 [FT-33266, FT-16693]
14 02/09/2019 4 [FT-13853, FT-18858, FT-62064, FT-58456]
15 05/09/2019 4 [FT-69649]
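For reference, here is a self-contained sketch of the same approach on a small subset of the sample data, including the join step for the case where strings are wanted instead of lists. The column name new_str and the date 06/09/2019 (which is absent from df1) are hypothetical and were added only to show how a missing date behaves. A list comprehension over zip is used here instead of apply with axis=1; it is a swapped-in variant that should give the same lists.

```python
import pandas as pd

# Subset of the sample data from the question
df1 = pd.DataFrame({
    "F_Id": ["FT-56832", "FT-93828", "FT-13853", "FT-78590", "FT-14296"],
    "F_Date": ["01/09/2019", "01/09/2019", "02/09/2019",
               "01/09/2019", "01/09/2019"],
})
df2 = pd.DataFrame({
    # 06/09/2019 is a hypothetical date with no match in df1
    "F_Date": ["01/09/2019", "01/09/2019", "02/09/2019", "06/09/2019"],
    "num_i_found": [5, 3, 5, 2],
})

# Map each date to the list of its IDs
d = df1.groupby("F_Date")["F_Id"].apply(list).to_dict()

# Slice each date's list to the requested length; unknown dates give []
df2["new"] = [d.get(date, [])[:n]
              for date, n in zip(df2["F_Date"], df2["num_i_found"])]

# Optional: turn each list into a comma-separated string
df2["new_str"] = df2["new"].apply(", ".join)
print(df2)
```

Note that the slice always takes the first IDs in the order they appear in df1. If an arbitrary subset is wanted instead, the lists would need to be shuffled or sampled before slicing.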