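I am struggling to find a way to iterate over Df and, for each row, apply a function that searches Df1 for the closest match (the goal is to add data from Df1 to Df). I have read and tried many of the approaches found here, without success. I would appreciate some pointers, especially if I am going down the wrong route:
Df (sample): for each row, take the datetime (I have removed it as the index) and search Df1 for the closest time match, then add the result (in this case Offshore_Umgeni_Deep at 09:03) to Df['new column'].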
                     datetime     cond06     temp03     pres07
0  2015-02-26 09:03:38.833000  49.448935  22.162381  10.909805
1         2015-02-26 09:03:39  50.098050  22.162781  10.885601
2  2015-02-26 09:03:39.167000  50.060446  22.164354  10.807413
3  2015-02-26 09:03:39.333000  50.239644  22.156575  10.788496
4  2015-02-26 09:03:39.500000  50.179168  22.160942  10.803082
Df1:
datetime S E Location_Name
0 2015-02-26 09:01:00 29.81192 31.04692 Offshore_Umgeni_Deep
1 2015-02-26 09:01:00 29.81176 31.04688 Offshore_Umgeni_Deep
2 2015-02-26 09:01:00 29.81159 31.04682 Offshore_Umgeni_Deep
3 2015-02-26 09:02:00 29.81140 31.04676 Offshore_Umgeni_Deep
4 2015-02-26 09:02:00 29.81127 31.04673 Offshore_Umgeni_Deep
5 2015-02-26 09:02:00 29.81116 31.04671 Offshore_Umgeni_Deep
6 2015-02-26 09:02:00 29.81110 31.04670 Offshore_Umgeni_Deep
7 2015-02-26 09:02:00 29.81109 31.04673 Offshore_Umgeni_Deep
8 2015-02-26 09:02:00 29.81107 31.04674 Offshore_Umgeni_Deep
9 2015-02-26 09:02:00 29.81105 31.04673 Offshore_Umgeni_Deep
10 2015-02-26 09:02:00 29.81103 31.04673 Offshore_Umgeni_Deep
11 2015-02-26 09:02:00 29.81103 31.04672 Offshore_Umgeni_Deep
12 2015-02-26 09:02:00 29.81103 31.04669 Offshore_Umgeni_Deep
13 2015-02-26 09:03:00 29.81102 31.04666 Offshore_Umgeni_Deep
14 2015-02-26 09:03:00 29.81103 31.04664 Offshore_Umgeni_Deep
15 2015-02-26 09:03:00 29.81104 31.04663 Offshore_Umgeni_Deep
16 2015-02-26 09:03:00 29.81105 31.04661 Offshore_Umgeni_Deep
17 2015-02-26 09:03:00 29.81106 31.04660 Offshore_Umgeni_Deep
18 2015-02-26 09:03:00 29.81107 31.04657 Offshore_Umgeni_Deep
19 2015-02-26 09:03:00 29.81109 31.04655 Offshore_Umgeni_Deep
20 2015-02-26 09:03:00 29.81110 31.04653 Offshore_Umgeni_Deep
21 2015-02-26 09:03:00 29.81111 31.04650 Offshore_Umgeni_Deep
22 2015-02-26 09:04:00 29.81113 31.04649 Offshore_Umgeni_Deep
23 2015-02-26 09:04:00 29.81114 31.04647 Offshore_Umgeni_Deep
24 2015-02-26 09:04:00 29.81116 31.04646 Offshore_Umgeni_Deep
25 2015-02-26 09:04:00 29.81117 31.04642 Offshore_Umgeni_Deep
26 2015-02-26 09:04:00 29.81118 31.04640 Offshore_Umgeni_Deep
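I could match on HH:MM only, but I might miss data that way, so I have been trying a least-error approach instead. After many attempts, I think the best option is to iterate over Df and, for each row, iterate over Df1: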
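import pandas as pd

# Tolerance for a match; `error` was never defined in my attempt, so this value is assumed
error = pd.Timedelta(seconds=30)

def timesearch(t, df1):
    # Scan Df1 and return the location of the first row within `error` of t
    # (note: the first row inside the tolerance, not necessarily the closest one)
    for idx, t1 in df1['datetime'].items():
        if abs(t1 - t) < error:
            return df1.loc[idx, 'Location_Name']
    return None

# For every row in Df, look up a matching row in Df1 and store its location
for idx, t in df['datetime'].items():
    df.loc[idx, 'Location_Name'] = timesearch(t, df1)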
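I know the above is not quite right, since while iterating it has to keep track of which row index in Df1 maps back to which row in Df. Hopefully the idea comes across clearly, though.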
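
To make the intended result concrete, here is a minimal sketch of the nearest-time lookup described above. It assumes `pandas.merge_asof` with `direction="nearest"` suits this case; the one-minute `tolerance`, the helper column `matched_time`, and the cut-down sample frames are my own assumptions for illustration, not part of the real data pipeline.

```python
import pandas as pd

# Small stand-ins for Df and Df1, using values from the samples above
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2015-02-26 09:03:38.833",
        "2015-02-26 09:03:39.000",
        "2015-02-26 09:03:39.167",
    ]),
    "cond06": [49.448935, 50.098050, 50.060446],
})
df1 = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2015-02-26 09:02:00",
        "2015-02-26 09:03:00",
        "2015-02-26 09:04:00",
    ]),
    "Location_Name": ["Offshore_Umgeni_Deep"] * 3,
})

# merge_asof requires both frames to be sorted by the key column
df = df.sort_values("datetime")
df1 = df1.sort_values("datetime")

# Keep a copy of the Df1 timestamp (hypothetical column name) so the
# matched time stays visible after the merge
right = df1[["datetime", "Location_Name"]].assign(matched_time=df1["datetime"])

merged = pd.merge_asof(
    df,
    right,
    on="datetime",
    direction="nearest",              # closest row before or after
    tolerance=pd.Timedelta("1min"),   # assumed maximum allowed gap
)
print(merged)
```

With these sample values the nearest Df1 timestamp is 09:04:00 (about 21 s away) rather than 09:03:00 (about 39 s away). Rows of Df with no Df1 row inside the tolerance would get NaN in `Location_Name`. Df1 has several rows per minute, and this sketch does not decide which of those duplicates should win.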