Question

Currently, the following code builds a query dynamically:

Code:

zip_cols = list(zip(['name','address'],
                    ['name_1','address_1']))


self.matches = self.features[
                            (
                                [
                                    reduce(
                                        lambda x, y: x + y,
                                        [self.features[a + "_" + c[0] + "_" + c[1]] for a in self._algos],
                                    )
                                    for c in zip_cols
                                ][0]
                                > (self.input_args.get('threshold', 0.7) * 4)
                            )
                            & (
                                [
                                    reduce(
                                        lambda x, y: x + y,
                                        [self.features[a + "_" + c[0] + "_" + c[1]] for a in self._algos],
                                    )
                                    for c in zip_cols
                                ][1]
                                > (self.input_args.get('threshold', 0.7) * 4)
                            )].copy()

Query:

matches = features[(
                    (
                       (features['fw_name_name_1'] / 100)  
                      + features['sw_name_name_1']
                      + features['jw_name_name_1']
                      + features['co_name_name_1']
                    )  > 2.8
                   ) 
                   & 
                    (
                       (
                        (features['fw_address_address_1'] / 100)  
                      + features['sw_address_address_1']
                      + features['jw_address_address_1']
                      + features['co_address_address_1']
                       ) > 2.8
                    )
           ].copy()

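However, this only works when source_compare_names contains exactly 2 columns, because the code indexes the list explicitly with [0] and [1]. It fails with 1 column or with more than 2.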

Answer 1

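Given the limited input and context, I would start with the following. The idea is to build each filter condition dynamically as a string, join the strings, and evaluate the result.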


from functools import reduce

threshold = self.input_args.get('threshold', 0.7) * 4
column_selection = [reduce(lambda x, y: x + y,
                           [self.features[a + "_" + c[0] + "_" + c[1]] for a in self._algos]) for c in zip_cols]

size = len(column_selection)  # one condition per column pair in zip_cols
total_filter_list = []
for i in range(size):
    # build the filter columns as list of strings
    total_filter_list.append(f'(column_selection[{i}] > {threshold})') 

# join the list of strings with '&', build the total filter criteria as string
total_filter_string = ' & '.join(total_filter_list)

# evaluate the filter
self.matches = self.features[eval(total_filter_string)].copy()
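
The same filter can probably be built without eval by combining the per-pair boolean masks with reduce and operator.and_. The following is only a minimal sketch under a few assumptions: build_match_mask and the demo data are hypothetical names invented for illustration, the cutoff is scaled by len(algos) instead of the hard-coded 4, the list of column pairs is non-empty, and the / 100 scaling of the fw column shown in the query is not handled.

```python
from functools import reduce
import operator

import pandas as pd


def build_match_mask(features, algos, column_pairs, threshold=0.7):
    """Return a boolean mask that is True where every column pair passes.

    Hypothetical helper: assumes columns are named "<algo>_<left>_<right>"
    and that column_pairs contains at least one pair.
    """
    cutoff = threshold * len(algos)
    masks = []
    for left, right in column_pairs:
        # Sum the scores of all algorithms for this column pair.
        total = reduce(operator.add,
                       (features[f"{a}_{left}_{right}"] for a in algos))
        masks.append(total > cutoff)
    # AND all per-pair masks together; works for 1, 2, or more pairs.
    return reduce(operator.and_, masks)


# Hypothetical demo data: two algorithms, two column pairs.
algos = ["sw", "jw"]
pairs = [("name", "name_1"), ("address", "address_1")]
features = pd.DataFrame({
    "sw_name_name_1": [0.9, 0.9, 0.2],
    "jw_name_name_1": [0.8, 0.9, 0.3],
    "sw_address_address_1": [0.9, 0.1, 0.9],
    "jw_address_address_1": [0.9, 0.2, 0.9],
})

matches = features[build_match_mask(features, algos, pairs)].copy()
single = features[build_match_mask(features, algos, pairs[:1])].copy()
```

Because the masks are collected in a list and folded with reduce, the number of column pairs no longer has to be exactly 2, and no string is evaluated as code.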
