Dynamically creating a query with Python's reduce function?

Time: 2019-05-06 12:11:19

Tags: python reduce

Currently, the code below builds the query dynamically like this:

Code:

zip_cols = list(zip(['name','address'],
                    ['name_1','address_1']))


self.matches = self.features[
                            (
                                [
                                    reduce(
                                        lambda x, y: x + y,
                                        [self.features[a + "_" + c[0] + "_" + c[1]] for a in self._algos],
                                    )
                                    for c in zip_cols
                                ][0]
                                > (self.input_args.get('threshold', 0.7) * 4)
                            )
                            & (
                                [
                                    reduce(
                                        lambda x, y: x + y,
                                        [self.features[a + "_" + c[0] + "_" + c[1]] for a in self._algos],
                                    )
                                    for c in zip_cols
                                ][1]
                                > (self.input_args.get('threshold', 0.7) * 4)
                            )].copy()

Query:

matches = features[(
                    (
                       (features['fw_name_name_1'] / 100)  
                      + features['sw_name_name_1']
                      + features['jw_name_name_1']
                      + features['co_name_name_1']
                    )  > 2.8
                   ) 
                   & 
                    (
                       (
                        (features['fw_address_address_1'] / 100)  
                      + features['sw_address_address_1']
                      + features['jw_address_address_1']
                      + features['co_address_address_1']
                       ) > 2.8
                    )
           ].copy()

However, this query only works when source_compare_names contains exactly 2 columns; it fails with 1 column or with more than 2.

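1 Answer:

Answer 0: (score: 0)

Given the minimal input and context, I'll have a go. The idea is that you can dynamically build the filter conditions as strings, join them, and evaluate the result.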


from functools import reduce

threshold = self.input_args.get('threshold', 0.7) * 4
# one summed score Series per column pair in zip_cols
column_selection = [reduce(lambda x, y: x + y,
                           [self.features[a + "_" + c[0] + "_" + c[1]] for a in self._algos]) for c in zip_cols]

size = len(column_selection)  # one condition per column pair, instead of a hard-coded count
total_filter_list = []
for i in range(size):
    # build the filter conditions as a list of strings
    total_filter_list.append(f'(column_selection[{i}] > {threshold})')

# join the list of strings with '&' to build the total filter criteria as one string
total_filter_string = ' & '.join(total_filter_list)

# evaluate the filter (eval resolves column_selection from the current scope)
self.features[eval(total_filter_string)]
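
As an alternative to building a string and calling eval, the per-column conditions can be combined directly with a second reduce. The following is only a minimal sketch under assumptions: `algos`, `zip_cols`, and the toy `features` DataFrame are hypothetical stand-ins for `self._algos`, the question's `zip_cols`, and `self.features`, and the third column pair (`city`) is invented to show that the number of pairs does not matter.

```python
from functools import reduce
import operator

import pandas as pd

# Hypothetical stand-ins for self._algos, zip_cols and the threshold.
algos = ["fw", "sw", "jw", "co"]
zip_cols = [("name", "name_1"), ("address", "address_1"), ("city", "city_1")]
threshold = 0.7 * 4

# Toy scores per column pair (same score for every algorithm):
# row 0 passes every pair, row 1 fails on address, row 2 fails everywhere.
toy_scores = [[0.9, 0.9, 0.1], [0.9, 0.2, 0.1], [0.9, 0.9, 0.1]]
features = pd.DataFrame(
    {
        f"{a}_{src}_{dst}": scores
        for (src, dst), scores in zip(zip_cols, toy_scores)
        for a in algos
    }
)

# One boolean mask per column pair: summed algorithm scores above the threshold.
masks = [
    reduce(operator.add, [features[f"{a}_{src}_{dst}"] for a in algos]) > threshold
    for src, dst in zip_cols
]

# AND all masks together, however many column pairs there are.
combined = reduce(operator.and_, masks)
matches = features[combined].copy()
print(matches.index.tolist())
```

With a single column pair, reduce simply returns that one mask, so the same code should cover the 1-column case as well. It does assume `zip_cols` is non-empty, since reduce raises a TypeError on an empty list when no initial value is given.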