Python - Look up a value in one table that falls within a range in a second table

Asked: 2016-05-17 18:10:37

Tags: python lookup arcpy

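I have two tables. The first contains SCHEDULE_DATE (more than 300,000 records) and WORK_WEEK_CODE; the second contains WORK_WEEK_CODE, START_DATE, and END_DATE. The first table has duplicate schedule dates, while the second holds 3,200 unique values. I need to populate WORK_WEEK_CODE in Table 1 with the WORK_WEEK_CODE from Table 2, based on the range that each schedule date falls in. Samples of both tables are shown below.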

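I was able to accomplish the task using an arcpy.da.SearchCursor nested inside an arcpy.da.UpdateCursor, but with this many records it takes a very long time. Any suggestions for a better (and faster) approach would be highly appreciated.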

Note: the date fields are stored as strings.

Table 1

SCHEDULE_DATE,WORK_WEEK_CODE  
20160219    
20160126    
20160219    
20160118    
20160221    
20160108    
20160129    
20160201    
20160214    
20160127

Table 2

WORK_WEEK_CODE,START_DATE,END_DATE  
1601,20160104,20160110  
1602,20160111,20160117  
1603,20160118,20160124  
1604,20160125,20160131  
1605,20160201,20160207  
1606,20160208,20160214  
1607,20160215,20160221

1 Answer:

Answer 0 (score: 0)

You can use Pandas DataFrames for a more efficient approach. Here is how to do it with Pandas. Hope this helps:

    import pandas as pd

    # Load both tables into DataFrames (read from CSV here).
    # dtype=str keeps the date fields as strings, as in the question.
    Table1 = pd.read_csv('Table1.csv', dtype=str)
    Table2 = pd.read_csv('Table2.csv', dtype=str)

    # Add a constant key to both tables so they can be joined on it.
    Table1['key'] = 1
    Table2['key'] = 1

    # Join on the constant key: every Table1 row is paired with every Table2 row.
    # The result has len(Table1) * len(Table2) rows, so it can need a lot of memory.
    # WORK_WEEK_CODE exists in both tables, so pandas suffixes it with _x and _y.
    mergeddf = pd.merge(Table1, Table2, how='left', on='key')

    # Convert the string dates to actual datetimes.
    mergeddf['SCHEDULE_DATE'] = pd.to_datetime(mergeddf['SCHEDULE_DATE'], format='%Y%m%d')
    mergeddf['START_DATE'] = pd.to_datetime(mergeddf['START_DATE'], format='%Y%m%d')
    mergeddf['END_DATE'] = pd.to_datetime(mergeddf['END_DATE'], format='%Y%m%d')

    # Keep only the rows where the schedule date falls inside the week's range.
    in_range = (mergeddf['SCHEDULE_DATE'] >= mergeddf['START_DATE']) & (mergeddf['SCHEDULE_DATE'] <= mergeddf['END_DATE'])
    result = mergeddf[in_range]
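One caveat with joining on a constant key: with more than 300,000 schedule dates and 3,200 week ranges, the intermediate table holds roughly 960 million rows before filtering. If that is too much memory, a binary search over the sorted start dates is one possible alternative. The following is only a minimal sketch using the standard library, not the method from the answer above. It assumes the week ranges do not overlap, and it relies on `YYYYMMDD` strings sorting in chronological order, so no date conversion is needed. The names `weeks`, `starts`, and `find_work_week` are hypothetical and chosen for illustration.

```python
from bisect import bisect_right

# Sample rows from Table 2: (WORK_WEEK_CODE, START_DATE, END_DATE).
weeks = [
    ("1601", "20160104", "20160110"),
    ("1602", "20160111", "20160117"),
    ("1603", "20160118", "20160124"),
    ("1604", "20160125", "20160131"),
    ("1605", "20160201", "20160207"),
    ("1606", "20160208", "20160214"),
    ("1607", "20160215", "20160221"),
]

# Sort by START_DATE so that binary search can be used.
# Assumption: the ranges do not overlap.
weeks.sort(key=lambda row: row[1])
starts = [row[1] for row in weeks]


def find_work_week(schedule_date):
    """Return the WORK_WEEK_CODE whose range contains schedule_date, or None."""
    # Position of the last range whose START_DATE <= schedule_date.
    i = bisect_right(starts, schedule_date) - 1
    if i < 0:
        return None  # earlier than the first START_DATE
    code, _start, end = weeks[i]
    # The date may fall after this range's END_DATE (a gap or past the last week).
    return code if schedule_date <= end else None


# A few SCHEDULE_DATE values from Table 1.
schedule_dates = ["20160219", "20160126", "20160118", "20160108"]
codes = [find_work_week(d) for d in schedule_dates]
```

Each lookup costs about log2(3,200), or roughly 12 comparisons, instead of a scan over all 3,200 ranges. The same function could presumably be called once per row inside a single arcpy.da.UpdateCursor loop, after loading Table 2 into `weeks` with one arcpy.da.SearchCursor pass, which would replace the nested cursor.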